Recently, management in our organization decided to use the Hadoop platform for new data analytics initiatives. Our early-adoption use cases are in the near-real-time, data archival, data staging, and log data analytics areas. For the initial POC phase we have 10-node virtual machine clusters, and we have started experimenting with Sqoop and Flume along with Pig and Hive. As part of the learning process, I have also completed the “Cloudera Certified Developer for Apache Hadoop (CCDH)” certification.
Since Hadoop is new in our organization, we started from scratch, setting up things such as a directory structure and a process for code migration. A directory structure is needed both in the local Unix file system and in HDFS: the local file system holds software and code, while HDFS holds raw data, intermediate data, and configuration files. In the Hadoop ecosystem a new piece of software seems to appear every other month, so having a proper directory structure becomes very important.
Since Hadoop is new to most of us and we don’t have much information to go on, I thought it would be good to write a blog, and hopefully it will be helpful for some of you. I will try to write a new post whenever I learn something new. Please leave comments or feedback; I am new to blogging as well, so it will help me write a better blog.
So let’s get started with what we did for the directory structure. First, for software in the local Unix file system, we came up with the following directory structure with the help of the Unix admin team. The software directories are under /usr/lib; for now we have hadoop, hive, pig, hbase, sqoop, flume, oozie, java, zookeeper, hcatalog, etc. We will add a new directory under /usr/lib whenever we get new software.
The second directory structure is in the local Unix file system, for code such as Pig, Hive, Flume, and Sqoop scripts. We created our application directory under /home, and under the application directory we have common, module1, and module2. Under each of these we have conf, lib, bin, util, and data. Each directory is explained below.
- Application1: It’s good to have a separate directory for each application.
- common: All the common libraries, config files, and scripts live under this directory.
- module1 & 2: Module-specific directories can be created if needed; otherwise common can be used.
- conf: For all the properties and config files.
- lib: For all the external libraries and executable jar files.
- bin: For all the Pig, Hive, Flume, and Sqoop scripts.
- util: For all the wrapper scripts.
- data: A placeholder for data needed during processing; technically, we are not going to store any data files here.
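The local application layout described above could be created with a small script along these lines. This is only a sketch, not the setup we actually ran: the function name `create_app_tree` and the application name `application1` are made up for illustration, and a temporary directory stands in for /home so it is safe to try anywhere.

```python
from pathlib import Path
from typing import List
import tempfile

# Directory names taken from the post; the application name passed in below
# is a hypothetical placeholder.
SECTIONS = ["common", "module1", "module2"]
SUBDIRS = ["conf", "lib", "bin", "util", "data"]


def create_app_tree(base: Path, application: str = "application1") -> List[Path]:
    """Create <base>/<application>/<section>/<subdir> and return the created paths."""
    created = []
    for section in SECTIONS:
        for sub in SUBDIRS:
            path = base / application / section / sub
            # parents=True builds intermediate directories; exist_ok=True makes reruns harmless.
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created


if __name__ == "__main__":
    # A temporary directory stands in for /home in this sketch.
    with tempfile.TemporaryDirectory() as tmp:
        for p in create_app_tree(Path(tmp)):
            print(p.relative_to(tmp))
```

Keeping the section and subdirectory names in two lists means a new module only needs one extra entry in `SECTIONS`.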
The third directory structure is in HDFS, for raw data, intermediate files, output files, metadata, etc. The directories below are available in HDFS under /lob/application/module/.
- data: For all the input data files and the processed output files.
- work: For intermediate files generated during the workflow process.
- metadata: For all the metadata, schema, property files.
- archive: All the input and processed data will be moved here periodically.
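As a rough illustration of the HDFS layout above, the snippet below builds the `hdfs dfs -mkdir -p` commands for one module. It only prints the commands rather than running them, and the helper name `hdfs_mkdir_commands` and the literal values `lob`, `application`, and `module` are placeholders for whatever names apply in your environment.

```python
from typing import List

# Subdirectory names taken from the list above.
HDFS_SUBDIRS = ["data", "work", "metadata", "archive"]


def hdfs_mkdir_commands(lob: str, application: str, module: str) -> List[str]:
    """Return one 'hdfs dfs -mkdir -p' command per subdirectory under /lob/application/module."""
    base = f"/{lob}/{application}/{module}"
    return [f"hdfs dfs -mkdir -p {base}/{sub}" for sub in HDFS_SUBDIRS]


if __name__ == "__main__":
    # Placeholder names; substitute the real line of business, application, and module.
    for cmd in hdfs_mkdir_commands("lob", "application", "module"):
        print(cmd)
```

Printing the commands first makes it easy to review them with the admin team before anything is created in HDFS.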
Please let me know what you think about this blog.
This really makes sense to me. Thanks
Useful information, Thanks
Very useful information on hadoop