Logstash and Treasure Data
Logstash is open source log management software, widely known and used as part of the ELK stack. There's a great collection of plugin repositories for Logstash to collect, filter and store data from many sources and to many destinations, but it doesn't include a plugin to store data in the Treasure Data Service.
So I created one.
Repository on GitHub
Released gem
How to install / use the plugin
You can install the plugin very easily if you're already using Logstash.
$ cd /path/of/logstash
$ bin/plugin install logstash-output-treasure_data
Validating logstash-output-treasure_data
Installing logstash-output-treasure_data
Installation successful
Next, configure Logstash to send data to Treasure Data. The plugin requires the names of the database and table to insert into, and an API key, which you can find on the Treasure Data Console.
input {
  # ...
}
output {
  treasure_data {
    apikey => "0/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    database => "dbname"
    table => "tablename"
  }
}
Then, launch Logstash with that configuration file.
$ bin/logstash -f your.conf
Rows will then appear on the Treasure Data Console. Log message text is stored in the message column, along with some additional columns (e.g. time, host and version).
Specifications / Configurations
Currently, this plugin stores all logs in a single table. To insert data into two or more tables, use two or more treasure_data sections in the configuration file.
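A setup with two treasure_data sections could look like the sketch below. The type field, its values (apache, app) and the table names are hypothetical placeholders, not anything the plugin defines. The conditionals are ordinary Logstash configuration syntax; without them, every event would be sent to both outputs.

```
output {
  # Events whose (hypothetical) type field is "apache" go to one table.
  if [type] == "apache" {
    treasure_data {
      apikey => "0/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      database => "dbname"
      table => "access_logs"
    }
  }
  # Events whose type field is "app" go to another table.
  if [type] == "app" {
    treasure_data {
      apikey => "0/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      database => "dbname"
      table => "app_logs"
    }
  }
}
```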
The plugin also buffers data in memory for up to 5 minutes. Buffered data will be lost if the Logstash process crashes. (For another option, see the "Combination with Fluentd" section.)
This plugin has the configuration options listed below. The default values work in almost all cases, but some of these options might help in unstable network environments.
apikey (required)
database (required)
table (required)
auto_create_table [true]: the plugin creates the table if it does not exist on Treasure Data
endpoint [api.treasuredata.com]
use_ssl [true]
http_proxy [none]
connect_timeout [60s]
read_timeout [600s]
send_timeout [600s]
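As a rough illustration of tuning for an unstable network, the sketch below raises the three timeouts above their defaults. It assumes the timeout options take plain numbers of seconds, which the defaults listed above suggest but do not state, so check the plugin's README before relying on it. The values themselves are arbitrary.

```
output {
  treasure_data {
    apikey => "0/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    database => "dbname"
    table => "tablename"
    # Assumption: timeouts are given as plain numbers, in seconds.
    connect_timeout => 120
    read_timeout => 1200
    send_timeout => 1200
  }
}
```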
Combination with Fluentd
For now, logstash-output-treasure_data has very limited features, especially in buffering, in specifying destination tables, and in performance.
Another option is to use Fluentd for more flexible, higher-performance transfer. The logstash-output-fluentd plugin makes this possible.
GitHub repository
Released gem
[host a] -> (logstash-output-fluentd) ->
[host b] -> (logstash-output-fluentd) -> [fluentd] -> (fluent-plugin-td) -> [Treasure Data]
[host c] -> (logstash-output-fluentd) ->
Many Logstash instances can be configured to send their logs to a single Fluentd node, which then stores all the data in Treasure Data.
# Configuration for Logstash
input {
  # ...
}
output {
  fluentd {
    host => "your.host.name.example.com"
    port => 24224 # default
    tag => "td.database.tablename"
  }
}

# Configuration for Fluentd
<source>
  @type forward
  port 24224
</source>
<match td.*.*>
  @type tdlog
  apikey "0/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
  buffer_path /mnt/fluentd/buffer/td
</match>
The Fluentd tdlog plugin can store data in many database/table combinations by parsing tags of the form td.dbname.tablename. So you can specify any database/table pair in the Logstash configuration files if you want.
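For example, a Logstash node might route events to different tables by choosing a different tag per output, roughly as sketched below. The type field, its values, and the database and table names are made up for illustration; host, port and tag are the same options used in the Logstash configuration above.

```
output {
  if [type] == "apache" {
    fluentd {
      host => "your.host.name.example.com"
      port => 24224
      # Hypothetical names: database "weblogs", table "access"
      tag => "td.weblogs.access"
    }
  }
  if [type] == "app" {
    fluentd {
      host => "your.host.name.example.com"
      port => 24224
      # Hypothetical names: database "applogs", table "events"
      tag => "td.applogs.events"
    }
  }
}
```

Both tags match the td.*.* pattern in the Fluentd configuration above, so nothing needs to change on the Fluentd side.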
Conclusion
If you are already a Logstash user, you can now store your data in Treasure Data very easily by using logstash-output-treasure_data or logstash-output-fluentd. Happy logging!














