计算机系统应用 (Computer Systems Applications), 2014, 23(5):167-171
Optimization Strategy for Small-File Storage and Reading on HDFS
(English title as published: Optimizational Strategy of Small Files Stored and Readed on HDFS)
(School of Computer Science and Software Engineering, Hebei University of Technology, Tianjin 300401, China)
Received: October 4, 2013    Revised: October 29, 2013
Abstract: This paper studies the HDFS distributed file system in depth. HDFS is very efficient when large files are accessed in a streaming manner, but storing and retrieving massive numbers of small files is comparatively inefficient. To address this problem, the paper proposes a small-file merging strategy based on a relational database. First, a user file is created for each user. Second, when a user uploads a small file, the file's metadata is stored in the relational database and the file itself is appended to the user file. Finally, when the user reads a small file, it is read directly in streaming mode using the stored metadata. In addition, when a user reads a file smaller than one file block, a DataNode load-balancing strategy is applied: the data is sent to the client directly by the DataNode that stores it, which reduces pressure on the master server and improves transfer efficiency. Experimental results show that the scheme addresses HDFS's weak support for storing and retrieving large numbers of small files and improves HDFS read and write performance on massive small files. The scheme is suited to cloud storage systems that hold massive numbers of small files; it reduces NameNode memory consumption and improves file read and write efficiency.
Keywords: HDFS; small-file optimization; file merging; load balancing; cloud storage
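The merge-and-index idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a local directory stands in for HDFS, SQLite stands in for the relational database, and the class name `SmallFileStore`, the table `small_files`, and its columns are hypothetical names chosen for illustration. The abstract does not specify the metadata schema; the sketch assumes that an offset and a length per small file are enough to locate it inside the merged user file.

```python
import os
import sqlite3
import tempfile


class SmallFileStore:
    """Toy model: one merged file per user plus a metadata table.

    Assumption: a local directory replaces HDFS and SQLite replaces the
    relational database; all names here are hypothetical.
    """

    def __init__(self, root, db_path=":memory:"):
        self.root = root
        os.makedirs(root, exist_ok=True)
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS small_files ("
            " user TEXT, name TEXT, offset INTEGER, length INTEGER,"
            " PRIMARY KEY (user, name))"
        )

    def _user_file(self, user):
        # One merged file per user, created on first upload.
        return os.path.join(self.root, user + ".merged")

    def upload(self, user, name, data):
        # Append the small file to the user's merged file and
        # record where it starts and how long it is.
        with open(self._user_file(user), "ab") as f:
            offset = f.seek(0, os.SEEK_END)
            f.write(data)
        self.db.execute(
            "INSERT INTO small_files VALUES (?, ?, ?, ?)",
            (user, name, offset, len(data)),
        )
        self.db.commit()

    def read(self, user, name):
        # Look up the metadata, then seek straight to the stored bytes.
        row = self.db.execute(
            "SELECT offset, length FROM small_files"
            " WHERE user = ? AND name = ?",
            (user, name),
        ).fetchone()
        if row is None:
            raise KeyError((user, name))
        offset, length = row
        with open(self._user_file(user), "rb") as f:
            f.seek(offset)
            return f.read(length)


if __name__ == "__main__":
    store = SmallFileStore(tempfile.mkdtemp())
    store.upload("alice", "a.txt", b"hello")
    store.upload("alice", "b.txt", b"world!")
    print(store.read("alice", "b.txt"))
```

In this model the file system holds one object per user rather than one per small file, which is the property the abstract credits with lowering NameNode memory consumption. The DataNode load-balancing step is not modeled here, since the abstract gives no detail on how it is carried out.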
Cite as:
ZHANG Hai, MA Jian-Hong. Optimizational Strategy of Small Files Stored and Readed on HDFS. COMPUTER SYSTEMS APPLICATIONS, 2014, 23(5):167-171