snappy

Kafka message compression algorithms

岁酱吖の submitted on 2019-12-06 16:24:49
How does Kafka compress messages? To answer that, we have to start with Kafka's message format. Kafka's messages are organized in two layers: message sets and messages. A message set holds a number of record items, and a record item is where a message is actually wrapped. Kafka's underlying message log is made up of a series of message-set record items. Kafka generally does not operate on individual messages; it always writes at the message-set level. In Kafka, compression can happen in two places: on the producer side and on the broker side. Setting the compression.type parameter in the producer program enables the specified compression algorithm:

```java
public class KafkaProduce {
    public void kafkaProducer() throws Exception {
        Properties pro = new Properties();
        ... // other configuration parameters
        pro.put("partitioner.class", "kafka.KafkaPartitioner");
        // enable compression
        pro.put("compression.type", "gzip");
        KafkaProducer config = new KafkaProducer(pro);
    }
}
```

This tells the producer to use gzip as its compression algorithm.
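Why compressing at the message-set level matters can be illustrated with a small sketch. This uses Python's stdlib gzip rather than Kafka's own codecs, and the log messages are made up for the demonstration: batching many similar messages before compressing gives a far better ratio than compressing each message on its own.

```python
import gzip

# Hypothetical batch of small, similar log messages (contents are illustrative).
messages = [f"user={i} action=click page=/home".encode() for i in range(100)]

# Compressing each message individually, as a naive per-message scheme would.
per_message = sum(len(gzip.compress(m)) for m in messages)

# Compressing the whole batch at once, analogous to message-set-level compression.
batched = len(gzip.compress(b"\n".join(messages)))

# Batching amortizes per-stream headers and lets repeated substrings share
# one compression dictionary, so the batched size is much smaller.
print(per_message > batched)  # True
```

The same effect is why Kafka producers accumulate records into batches before applying compression.type.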

Reading the Kafka source: loading classes dynamically to reduce project dependencies

£可爱£侵袭症+ submitted on 2019-12-06 02:30:55
Dynamically loading classes in Java. The most common scenario: some functionality is never used in your project, yet because the project references classes from that dependency, you are forced to add the dependency even though the service is never used. By loading classes dynamically instead, a project can shed much of its third-party dependency weight. The core idea is to use reflection plus ordinary conditional logic to control when objects are instantiated. Take the Kafka source as an example; the following is from Kafka's compression module:

```java
public Compressor(ByteBuffer buffer, CompressionType type) {
    this.type = type;
    this.initPos = buffer.position();
    ......
    appendStream = wrapForOutput(bufferStream, type, COMPRESSION_DEFAULT_BUFFER_SIZE); // Kafka's compression wrapper
}

// the following two functions also need to be public since they are used in MemoryRecords.iteration
static public DataOutputStream wrapForOutput(ByteBufferOutputStream buffer,
```
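The same reflection-plus-conditional idea can be sketched outside the JVM. Here is a minimal Python version using importlib; the load_codec helper is hypothetical (not from the Kafka source) and just shows how an optional dependency degrades gracefully instead of failing at load time.

```python
import importlib

def load_codec(name, fallback=None):
    """Return the named module only if it is importable, so the
    package stays an optional rather than a hard dependency."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # Dependency is absent: fall back instead of crashing at startup.
        return fallback

# zlib ships with CPython, so this load succeeds ...
codec = load_codec("zlib")
print(codec is not None)  # True

# ... while a codec that is not installed degrades to the fallback.
missing = load_codec("no_such_codec_module")
print(missing is None)  # True
```

In the Java version, Class.forName plus an if/else on the configured compression type plays the role that importlib plays here.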

How to install snappy C libraries on Windows 10 for use with python-snappy in Anaconda?

耗尽温柔 submitted on 2019-12-05 22:59:05
Question: I want to install parquet for Python using pip inside an Anaconda 2 installation on Windows 10. While installing I ran into the error described here: the installer can't find snappy-c.h. There is no mention in the answers of how to install this on Windows. I downloaded the Snappy library from http://google.github.io/snappy/ and now I'm stuck. From my error message I would have assumed that the header files need to be in C:\Users\...\AppData\Local\Continuum\Anaconda2\include, but in

How to decompress a Hadoop reduce output file ending with .snappy?

纵然是瞬间 submitted on 2019-12-05 13:17:31
Question: Our Hadoop cluster uses snappy as the default codec, so a Hadoop job's reduce output file is named like part-r-00000.snappy. JSnappy fails to decompress the file because JSnappy requires the file to start with SNZ, while the reduce output file somehow starts with some zero bytes. How can I decompress the file?

Answer 1: Use "hadoop fs -text" to read this file and pipe it to a txt file, e.g.:

hadoop fs -text part-r-00001.snappy > /tmp/mydatafile.txt

Source: https://stackoverflow.com/questions/19805149/how-to-decompress-the-hadoop

Why is Parquet slower for me than the text file format in Hive?

99封情书 submitted on 2019-12-05 11:15:47
OK! So I decided to use Parquet as the storage format for Hive tables, and before actually implementing it in my cluster I decided to run some tests. Surprisingly, Parquet was slower in my tests, against the general notion that it is faster than plain text files. Please note that I am using Hive-0.13 on MapR. The flow of my operations follows:

Table A - Format: Text. Table size: 2.5 GB
Table B - Format: Parquet. Table size: 1.9 GB [Create table B stored as parquet as select * from A]
Table C - Format: Parquet with snappy compression. Table size: 1.9 GB [Create table C stored as parquet

Comparison between lz4 vs lz4_hc vs blosc vs snappy vs fastlz

半腔热情 submitted on 2019-12-04 10:15:03
Question: I have a large file of size 500 MB to compress within a minute with the best possible compression ratio. I have found these algorithms to be suitable for my use: lz4, lz4_hc, snappy, quicklz, blosc. Can someone give a comparison of speed and compression ratios between these algorithms?

Answer 1: Yann Collet's lz4, hands down.

Answer 2: This might help you: (lz4 vs snappy) http://java-performance.info/performance-general-compression/ (benchmarks for lz4, snappy, lz4hc, blosc) https://web.archive.org/web
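To get a feel for the speed/ratio trade-off yourself, the measurement loop can be sketched with Python's stdlib codecs (zlib, bz2, lzma). The algorithms listed in the question (lz4, snappy, blosc, etc.) need third-party packages, but their compress functions would plug into the same pattern; the payload below is a small repetitive stand-in for the 500 MB file.

```python
import bz2
import lzma
import time
import zlib

# Small, repetitive stand-in payload; swap in your real file's bytes.
data = b"some moderately compressible payload " * 50_000

# Measure wall-clock compression time and resulting ratio per codec.
for name, compress in [("zlib", zlib.compress),
                       ("bz2", bz2.compress),
                       ("lzma", lzma.compress)]:
    start = time.perf_counter()
    out = compress(data)
    elapsed = time.perf_counter() - start
    print(f"{name}: ratio={len(data) / len(out):.1f}x time={elapsed:.3f}s")
```

Running this on a sample of the real data is usually more informative than published benchmarks, since ratios depend heavily on the input's redundancy.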

Cassandra 1.2 fails to init Snappy on FreeBSD

和自甴很熟 submitted on 2019-12-04 06:16:23
ERROR [WRITE-/10.10.35.30] 2013-06-19 23:15:56,907 CassandraDaemon.java (line 175) Exception in thread Thread[WRITE-/10.10.35.30,5,main]
java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy
    at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
    at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:66)
    at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:341)
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:143)

When going through known issues I found this: The

on namespace ceilometer.$cmd failed: Authentication failed. (troubleshooting and resolution)

假装没事ソ submitted on 2019-12-03 12:17:44
on namespace ceilometer.$cmd failed: Authentication failed. UserNotFound: Could not find user ceilometer@ceilometer

Background:
1. Ceilometer is the OpenStack component for metering and billing; it later grew to include some monitoring-collection and alerting features.
2. MongoDB is a database built on distributed file storage, written in C++, designed to provide a scalable, high-performance data storage solution for web applications.
3. A project from a few years ago combined Ceilometer with MongoDB (version 3.2.9) to store performance and alert data.

Problem description:
Recently, at a customer site, the storage device MongoDB was mounted on failed. After the storage device was recovered, the MongoDB service would not start. The startup log reported:

2019-10-31T16:33:27.651+0800 I CONTROL [main] ***** SERVER RESTARTED *****
2019-10-31T16:33:27.658+0800 I CONTROL [initandlisten] MongoDB starting : pid=5097 port=27017 dbpath=/var/lib/mongodb 64-bit

Hadoop cluster setup - 03: building and installing Hadoop

痞子三分冷 submitted on 2019-12-03 09:42:04
Hadoop cluster setup - 05: installing and configuring YARN
Hadoop cluster setup - 04: installing and configuring HDFS
Hadoop cluster setup - 03: building and installing Hadoop
Hadoop cluster setup - 02: installing and configuring Zookeeper
Hadoop cluster setup - 01: preliminary preparation

Hadoop is compiled and installed directly on one machine, here the nn1 machine, working as the root user throughout.

1. Some Hadoop resources are available here: https://www.lanzous.com/b849710/ password: 9vui

[hadoop@nn1 zk_op]$ su - root
[root@nn1 ~]# mkdir /tmp/hadoop_c
[root@nn1 ~]# cd /tmp/hadoop_c/

Upload the source package to the directory above with xshell's rz command.

[root@nn1 hadoop_c]# tar -xzf /tmp/hadoop_c/hadoop-2.7.3-src.tar.gz -C /usr/local/

Install the assorted software and plugins that will be needed with yum:

yum -y install svn ncurses-devel gcc* lzo-devel zlib-devel autoconf automake libtool cmake openssl-devel bzip2

2. Build and install protobuf, Google's protocol for communication and storage, which is required:

[root@nn1 ~]#

Unable to run Snappy player on Beaglebone Black using Yocto Project

Anonymous (unverified) submitted on 2019-12-03 09:10:12
Question: My main objective is to run the Snappy player (https://wiki.gnome.org/Snappy) on the target machine (BeagleBone Black), so I wrote a recipe for the Snappy player (snappy_1.0.bb) as below:

LICENSE = "GPLv2"
LIC_FILES_CHKSUM = "file://COPYING;md5=686e6cb566fd6382c9fcc7a557bf4544"
SRCREV = "e73fabce4c397b40d490c74f6a6a0de000804f42"
SRC_URI = "git://git.gnome.org/snappy"
S = "${WORKDIR}/git"
RDEPENDS_${PN} = "gtk+3 gstreamer1.0 glib-2.0 clutter-1.0 gstreamer1.0-plugins-base libxtst clutter-gst-3.0 clutter-gtk-1.0 libx11 cairo gdk-pixbuf"
# inherit line