Basic Data Types in Python

Like other programming languages, Python has multiple built-in data types to handle different programming scenarios. The basic data types available in Python are Boolean, Integer, Complex, Float, and String. Each is described by data type, symbol, notes, and value range. Boolean (bool): the Boolean data type is used when we represent conditional …
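As a rough illustration of the five types named above, the sketch below asks Python for the type name of one sample value per type. The sample values and the `samples` and `type_names` names are assumptions chosen for the example, not taken from the post.

```python
# One illustrative value per basic data type (values are arbitrary examples).
samples = {
    "Boolean": True,
    "Integer": 42,
    "Complex": 3 + 4j,
    "Float": 3.14,
    "String": "hello",
}

# Map each label to the name of the built-in type Python reports for it.
type_names = {label: type(value).__name__ for label, value in samples.items()}

for label, name in type_names.items():
    print(f"{label}: {name}")
```

The printed names (`bool`, `int`, `complex`, `float`, `str`) are the symbols Python itself uses for these built-in types.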

Hive Installation on Ubuntu

Steps to install Hive on Ubuntu.
Step 1: Create a directory named Hive and download the Hive tar file. The tar file can be downloaded using wget as shown below:
wget http://apachemirror.wuchna.com/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
Step 2: Extract the downloaded Hive tar file using the tar command with the -xvf option as …
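For readers who prefer to script these two steps, here is a minimal Python sketch using only the standard library in place of wget and tar. The helper names `archive_name`, `download`, and `extract` are hypothetical, and the mirror URL is copied from the excerpt and may no longer be reachable.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL as given in the post; the mirror may have moved since it was written.
HIVE_URL = "http://apachemirror.wuchna.com/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz"


def archive_name(url):
    """Return the file name at the end of a download URL."""
    return url.rsplit("/", 1)[-1]


def download(url, dest_dir):
    """Create dest_dir if needed and download url into it (stands in for wget)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / archive_name(url)
    urllib.request.urlretrieve(url, str(target))
    return target


def extract(archive, dest_dir):
    """Unpack a tar archive into dest_dir (stands in for tar -xvf)."""
    with tarfile.open(str(archive)) as tar:
        tar.extractall(str(dest_dir))
```

Nothing runs on import. Calling `extract(download(HIVE_URL, "Hive"), "Hive")` would mirror Step 1 and Step 2, assuming network access and a live mirror.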

MapReduce Hello World Programming for Beginners

MapReduce is a distributed computing programming model suited to processing very large data sets. Hadoop can run MapReduce programs written in various languages, including Java, Ruby, and Python. MapReduce programs are parallel in nature, which makes them well suited to large-scale data analysis across multiple machines in a cluster. Let’s write …
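The excerpt stops before naming its example. Assuming it is the customary word count, a single-process sketch of the map, shuffle, and reduce phases might look like the following. It runs without Hadoop, and every function name in it is hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


result = word_count(["hello world", "hello MapReduce"])
print(result)  # {'hello': 2, 'world': 1, 'mapreduce': 1}
```

On a real cluster the map and reduce functions would run on different machines. Here they are ordinary function calls so the data flow is easy to follow.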

Introduction to MapReduce

MapReduce is a distributed computing programming model suited to processing very large data sets. Hadoop can run MapReduce programs written in various languages, including Java, Ruby, and Python. MapReduce programs are parallel in nature, which makes them well suited to large-scale data analysis across multiple machines in a cluster. MapReduce is …
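To make the "parallel in nature" point concrete, the sketch below applies a map function to independent chunks on a small thread pool and then folds the partial results together. Threads stand in for cluster machines here, and `map_reduce` with its parameters is a hypothetical helper, not a Hadoop API.

```python
import operator
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_reduce(chunks, mapper, reducer, initial, workers=2):
    """Run mapper over each chunk concurrently, then fold the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(mapper, chunks))
    return reduce(reducer, partials, initial)


# Each chunk plays the role of the data held by one machine.
chunks = [[1, 2, 3], [4, 5], [6]]
total = map_reduce(chunks, sum, operator.add, 0)
print(total)  # 21
```

Because each chunk is processed independently, adding more workers (or machines) changes only how the map step is scheduled, not the final result.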

Introduction to Hadoop Architecture

Hadoop is an open-source distributed processing framework that manages data processing and storage for big data applications running in clustered environments. Hadoop Service Architecture. HDFS (Hadoop Distributed File System) overview: Hadoop is normally deployed on a group of machines (a cluster). Each machine in the cluster is a node. One of the nodes …

Hadoop Installation

SINGLE-NODE [STANDALONE] CLUSTER INSTALLATION: This report describes the steps required to set up a single-node Hadoop cluster backed by the Hadoop Distributed File System, running on Ubuntu Linux. Hadoop is a framework written in Java for running applications on large clusters of commodity hardware and incorporates features similar …

Basic Syntax in Python

Identifiers in Python: An identifier is a user-defined name that represents a programming element such as a variable, function, class, or module. It is a string of letters, digits, and underscores that begins with a letter or an underscore. A …
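The naming rule above can be checked with the standard library. The sketch below combines `str.isidentifier()` with `keyword.iskeyword()`, since reserved words pass the character rule but cannot be used as names. The helper name `is_valid_identifier` is an assumption made for the example.

```python
import keyword


def is_valid_identifier(name):
    """True if name follows the identifier rule and is not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["my_var", "_private", "2fast", "class"]:
    print(candidate, is_valid_identifier(candidate))
```

Here `my_var` and `_private` are accepted, `2fast` is rejected because it starts with a digit, and `class` is rejected because it is a keyword.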
