[Hadoop] Setting Up the HDP Sandbox on VirtualBox

헤르메스 LIFE

헤르메스의날개 2021. 12. 2. 01:15

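I am taking a Hadoop course, and this post records how I set up the practice environment.

I will use the Hortonworks Sandbox, which ships with a single-node Hadoop installation, as the practice environment.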

1. Download and install VirtualBox (version 6.1.30).

https://www.virtualbox.org/

2. Download the Hortonworks Sandbox.

https://www.cloudera.com/downloads/hortonworks-sandbox.html

I downloaded version 2.5.0 for VirtualBox (the file is about 10 GB).

3. Import the Hortonworks Sandbox appliance into VirtualBox
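The import can also be done from the command line instead of the VirtualBox GUI. The following is a minimal sketch, assuming `VBoxManage` is on the PATH; the appliance file name and the VM name passed in are hypothetical and should be replaced with your own.

```python
import subprocess


def build_import_command(ova_path, vm_name):
    """Return the VBoxManage argument list that imports an appliance file."""
    return [
        "VBoxManage", "import", str(ova_path),
        "--vsys", "0", "--vmname", vm_name,  # name the first (only) VM in the appliance
    ]


def import_sandbox(ova_path, vm_name):
    """Run the import; raises CalledProcessError if VBoxManage reports a failure."""
    subprocess.run(build_import_command(ova_path, vm_name), check=True)
```

Calling `import_sandbox("sandbox.ova", "hdp-sandbox")` would run the import with those placeholder names. Since the appliance is about 10 GB, expect the import to take a while.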

4. Start the virtual machine
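If you prefer to start the VM without opening the VirtualBox window, something like the sketch below should work. It assumes `VBoxManage` is on the PATH, and the VM name is whatever you chose at import time (the name in the usage note is a placeholder).

```python
import subprocess


def build_start_command(vm_name, headless=True):
    """Return the VBoxManage argument list that starts a registered VM."""
    vm_type = "headless" if headless else "gui"
    return ["VBoxManage", "startvm", vm_name, "--type", vm_type]


def start_sandbox(vm_name, headless=True):
    """Start the VM; raises CalledProcessError if VBoxManage reports a failure."""
    subprocess.run(build_start_command(vm_name, headless), check=True)
```

For example, `start_sandbox("hdp-sandbox")` starts the VM in the background, and `start_sandbox("hdp-sandbox", headless=False)` opens the usual console window.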

5. Log in to Ambari ( http://127.0.0.1:8080/#/login )

Username / password: maria_dev / maria_dev
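The sandbox needs some time after boot before Ambari answers. A small helper like the one below can tell you when the port is reachable. This is only a sketch: it checks that 127.0.0.1:8080 (the address from the URL above) accepts TCP connections, not that the Ambari login page is fully ready, and the retry count and delay are arbitrary values I picked.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, attempts=60, delay=5.0):
    """Poll host:port up to `attempts` times, sleeping `delay` seconds after each failure."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False
```

Once `wait_for_port("127.0.0.1", 8080)` returns True, try opening the login URL in a browser.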

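6. Download the MovieLens data ( https://grouplens.org/datasets/movielens/ )

I used the MovieLens 100K dataset. According to the site, it was released in April 1998 and contains 100,000 ratings from about 1,000 users on about 1,700 movies.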
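Only two files from the downloaded archive are needed in the later steps: u.data and u.item. The sketch below pulls just those two out of the zip file. It assumes you have already downloaded the archive by hand and that the files sit somewhere inside it under those names; the archive and output paths are up to you.

```python
import zipfile
from pathlib import Path

WANTED = ("u.data", "u.item")


def extract_movielens_files(zip_path, out_dir, wanted=WANTED):
    """Copy the wanted files out of the archive into out_dir, ignoring folders inside the zip."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            name = Path(member).name  # strip any leading folder such as "ml-100k/"
            if name in wanted:
                target = out_dir / name
                target.write_bytes(archive.read(member))
                extracted.append(target)
    return extracted
```

The function returns the paths it wrote, so an empty list means the archive did not contain the expected file names.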

7. Open Hive View

8. Upload the data (ratings data and movie data)

Click the gear icon > set Delimiter to '9 TAB' > Close (this must be done before choosing the file).

Upload Table > Choose File > select u.data (ratings data)

Change Table name to udata.

Clicking Upload Table opens the Upload Progress popup and uploads the data.

Click the gear icon > set Delimiter to '124 |' > Close (this must be done before choosing the file).

Upload Table > Choose File > select u.item (movie data)

Change Table name to uItem.
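The numbers in the delimiter list are ASCII codes: 9 is the tab character and 124 is the pipe character. The sketch below shows the same two delimiters applied with Python's csv module. The sample lines are short illustrative strings that I wrote in the shape of u.data and u.item; they are not the full rows from the real files.

```python
import csv
import io

TAB = chr(9)     # the '9 TAB' option in the upload settings
PIPE = chr(124)  # the '124 |' option in the upload settings


def parse_rows(text, delimiter):
    """Split delimited text into rows of string fields, treating quotes as plain characters."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quoting=csv.QUOTE_NONE)
    return list(reader)


# Illustrative samples only (abbreviated, not copied from the dataset in full).
ratings = parse_rows("196\t242\t3\t881250949\n", TAB)
movies = parse_rows("1|Toy Story (1995)|01-Jan-1995\n", PIPE)
```

If the wrong delimiter is selected, each line ends up as a single column, which is why the delimiter has to be set before the file is chosen.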

9. Run a query

select movie_id, count(rating) as ratingcount
 from udata
 group by movie_id
order by ratingcount desc;

The interpretation is that the more ratings a movie has received, the more popular (and well liked) it was.
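To see what the query does without the sandbox running, the same statement can be tried on a few made-up rows in an in-memory SQLite database. This is a sketch, not Hive: SQLite just happens to accept the same SQL here. The `movie_id` and `rating` columns come from the query above, while `user_id`, `rating_time`, and the sample rows are assumptions I made for the example.

```python
import sqlite3

QUERY = """
select movie_id, count(rating) as ratingcount
  from udata
 group by movie_id
 order by ratingcount desc
"""


def rating_counts(rows):
    """Load (user_id, movie_id, rating, rating_time) rows and return (movie_id, count) pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table udata "
            "(user_id integer, movie_id integer, rating integer, rating_time integer)"
        )
        conn.executemany("insert into udata values (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


# Made-up sample: movie 50 has three ratings, movie 100 has two, movie 7 has one.
SAMPLE = [
    (1, 50, 5, 0), (2, 50, 4, 0), (3, 50, 3, 0),
    (1, 100, 4, 0), (2, 100, 5, 0),
    (1, 7, 2, 0),
]
```

With this sample, `rating_counts(SAMPLE)` returns `[(50, 3), (100, 2), (7, 1)]`, so the most-rated movie comes first. Note that the count ignores the rating values themselves: a movie with many low ratings would still rank high.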
