I have a problem when I try to open an .mp4 file with WebHDFS in a container. I'm using the bde2020 Hadoop image (https://github.com/big-data-europe/docker-hadoop). When I request http://localhost:9870/webhdfs/v1/pat...=GETFILESTATUS, I do get a response showing that the file is there, with read permissions. But when I request http://localhost:9870/webhdfs/v1/pat...eo.mp4?op=OPEN, it loads for a long time, then redirects to http://id_container_datanode:9864/we...:9000&offset=0 and shows an error page. I don't understand why I get a response with GETFILESTATUS but not with OPEN (I do have the codecs needed to play mp4 files).
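To make the two requests easier to compare, here is a minimal Python sketch that sends a WebHDFS request without following the redirect and returns the status code and the `Location` header. It is not part of my setup: the helper names (`webhdfs_path`, `redirect_host`, `probe`) are made up for illustration, and any path passed to them is a placeholder, since the real path is elided above. As far as I understand, GETFILESTATUS is answered by the NameNode itself, while OPEN is answered with a redirect to a DataNode, which would match the redirect I see in the browser.

```python
import http.client
from urllib.parse import quote, urlencode, urlsplit


def webhdfs_path(hdfs_path, op):
    """Build the request path for a WebHDFS operation (hypothetical helper)."""
    return "/webhdfs/v1" + quote(hdfs_path) + "?" + urlencode({"op": op})


def redirect_host(location):
    """Return the (hostname, port) found in a redirect Location URL."""
    parts = urlsplit(location)
    return parts.hostname, parts.port


def probe(hdfs_path, op, host="localhost", port=9870):
    """Send one GET request and do NOT follow redirects.

    http.client never follows redirects on its own, so the status code
    and the Location header come straight from the NameNode.
    host/port default to the NameNode port published in docker-compose.yml.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", webhdfs_path(hdfs_path, op))
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

With the stack running, calling `probe("/path/to/video.mp4", "OPEN")` (placeholder path) returns the status code and the redirect target, and `redirect_host` on that target shows which host name and port the client is then expected to reach.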
Here is my configuration. I changed almost nothing compared to the Git repository; I only added a share folder between my local network and Docker:
docker-compose.yml
version: "3"

services:
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop3.2.1-java8
    container_name: namenode
    restart: always
    ports:
      - 9870:9870
      - 9000:9000
    volumes:
      - hadoop_namenode:/hadoop/dfs/name
      - share:/share:consistent
    environment:
      - CLUSTER_NAME=test
    env_file:
      - ./hadoop.env

  datanode:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode
    restart: always
    volumes:
      - hadoop_datanode:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
    env_file:
      - ./hadoop.env

  resourcemanager:
    image: bde2020/hadoop-resourcemanager:2.0.0-hadoop3.2.1-java8
    container_name: resourcemanager
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode:9864"
    env_file:
      - ./hadoop.env

  nodemanager1:
    image: bde2020/hadoop-nodemanager:2.0.0-hadoop3.2.1-java8
    container_name: nodemanager
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode:9864 resourcemanager:8088"
    env_file:
      - ./hadoop.env

  historyserver:
    image: bde2020/hadoop-historyserver:2.0.0-hadoop3.2.1-java8
    container_name: historyserver
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode:9864 resourcemanager:8088"
    volumes:
      - hadoop_historyserver:/hadoop/yarn/timeline
    env_file:
      - ./hadoop.env

volumes:
  hadoop_namenode:
  hadoop_datanode:
  hadoop_historyserver:
  share:
    external: true
hadoop.env
CORE_CONF_fs_defaultFS=hdfs://namenode:9000
CORE_CONF_hadoop_http_staticuser_user=root
CORE_CONF_hadoop_proxyuser_hue_hosts=*
CORE_CONF_hadoop_proxyuser_hue_groups=*
CORE_CONF_io_compression_codecs=org.apache.hadoop.io.compress.SnappyCodec

HDFS_CONF_dfs_webhdfs_enabled=true
HDFS_CONF_dfs_permissions_enabled=false
HDFS_CONF_dfs_namenode_datanode_registration_ip___hostname___check=false

YARN_CONF_yarn_log___aggregation___enable=true
YARN_CONF_yarn_log_server_url=http://historyserver:8188/applicationhistory/logs/
YARN_CONF_yarn_resourcemanager_recovery_enabled=true
YARN_CONF_yarn_resourcemanager_store_class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
YARN_CONF_yarn_resourcemanager_scheduler_class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
YARN_CONF_yarn_scheduler_capacity_root_default_maximum___allocation___mb=8192
YARN_CONF_yarn_scheduler_capacity_root_default_maximum___allocation___vcores=4
YARN_CONF_yarn_resourcemanager_fs_state___store_uri=/rmstate
YARN_CONF_yarn_resourcemanager_system___metrics___publisher_enabled=true
YARN_CONF_yarn_resourcemanager_hostname=resourcemanager
YARN_CONF_yarn_resourcemanager_address=resourcemanager:8032
YARN_CONF_yarn_resourcemanager_scheduler_address=resourcemanager:8030
YARN_CONF_yarn_resourcemanager_resource__tracker_address=resourcemanager:8031
YARN_CONF_yarn_timeline___service_enabled=true
YARN_CONF_yarn_timeline___service_generic___application___history_enabled=true
YARN_CONF_yarn_timeline___service_hostname=historyserver
YARN_CONF_mapreduce_map_output_compress=true
YARN_CONF_mapred_map_output_compress_codec=org.apache.hadoop.io.compress.SnappyCodec
YARN_CONF_yarn_nodemanager_resource_memory___mb=16384
YARN_CONF_yarn_nodemanager_resource_cpu___vcores=8
YARN_CONF_yarn_nodemanager_disk___health___checker_max___disk___utilization___per___disk___percentage=98.5
YARN_CONF_yarn_nodemanager_remote___app___log___dir=/app-logs
YARN_CONF_yarn_nodemanager_aux___services=mapreduce_shuffle

MAPRED_CONF_mapreduce_framework_name=yarn
MAPRED_CONF_mapred_child_java_opts=-Xmx4096m
MAPRED_CONF_mapreduce_map_memory_mb=4096
MAPRED_CONF_mapreduce_reduce_memory_mb=8192
MAPRED_CONF_mapreduce_map_java_opts=-Xmx3072m
MAPRED_CONF_mapreduce_reduce_java_opts=-Xmx6144m
MAPRED_CONF_yarn_app_mapreduce_am_env=HADOOP_MAPRED_HOME=/opt/hadoop-3.2.1/
MAPRED_CONF_mapreduce_map_env=HADOOP_MAPRED_HOME=/opt/hadoop-3.2.1/
MAPRED_CONF_mapreduce_reduce_env=HADOOP_MAPRED_HOME=/opt/hadoop-3.2.1/
Dockerfile for namenode
FROM bde2020/hadoop-base:2.0.0-hadoop3.2.1-java8

MAINTAINER Ivan Ermilov <ivan.s.ermilov@gmail.com>

HEALTHCHECK CMD curl -f http://localhost:9870/ || exit 1

ENV HDFS_CONF_dfs_namenode_name_dir=file:///hadoop/dfs/name
RUN mkdir -p /hadoop/dfs/name
VOLUME /hadoop/dfs/name

ADD run.sh /run.sh
RUN chmod a+x /run.sh

EXPOSE 9870

CMD ["/run.sh"]
Thanks in advance for any clarification.