
Sharing my experience with real-time log indexing in Elasticsearch

Elasticsearch automatically creates a type mapping from the log data, but in most cases the generated mapping is not correct, so we have to change it. This is my experience of doing that.

1) run elasticsearch

2) run logstash with a configuration file

elk:/data1/elasticsearch/logstash-1.4.0> vi logstash-teton_runtime.conf

input{
    file{
        codec => json
        path => ["/data1/elasticsearch/temp/*.log"]
        start_position => "end"
    }
}

output{
    elasticsearch {
        cluster => "locketCast"
        node_name => "logstash-teton_runtime"
        host => "xxx.xxx.xxx.xxx"
        index => "teton_runtime"
    }
}

elk:/data1/elasticsearch/logstash-1.4.0> bin/logstash -f logstash-teton_runtime.conf

3) write a log file to /data1/elasticsearch/temp/*.log

{"androidId":"91e8b0c3d89","facebookId":"11250","type":0,"numberOfMyFollowers":0,"externalBalance":0,"userType":"F","uniqueId":"dSTSbIg","id":326599,"profileImagePath":"http://graph.facebook.com/1125547890/picture?type=large&width=200&height=200","tos":true,"activeUser":1,"Action":"unlock","$gender":"unknown","event":"Impression_test","age":36,"name":"abcdef","Campaign_id":"tec_cus_9321","created_at":1400294301000,"gender":"unknown","udob":578600000,"active_timestamp":1404520448000,"longitude":"0.00000000","$age":"04/30/1970","os":"4.1.2","user_group":"2","status":"YET_TO_REQUEST","zipcode":"12345","cash_amount":0,"manu":"LGE","email":"marshall.s@yahoo.com","appVersion":"1.5.3","dob":"04/30/1988","latitude":"0.00000000"}

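=> issue 1) : a field named "type" (here "type":0) cannot be used. The elasticsearch output appears to take the document type from the event's "type" field, so the numeric value breaks indexing; the field should be renamed.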
error log:
Failed to flush outgoing items {:outgoing_count=>1, :exception=>#<NameError: no method 'type' for arguments (org.jruby.RubyFixnum) on Java::OrgElasticsearchActionIndex::IndexRequest>, :backtrace=>["/data1/elasticsearch/logstash-1.4.0/lib/logstash/outputs/elasticsearch/protocol.rb:225:in `build_request'", "/data1/elasticsearch/logstash-1.4.0/lib/logstash/outputs/elasticsearch/protocol.rb:205:in `bulk'", "org/jruby/RubyArray.java:1613:in `each'", "/data1/elasticsearch/logstash-1.4.0/lib/logstash/outputs/elasticsearch/protocol.rb:204:in `bulk'",
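
One way to work around issue 1 is to rename the key before Logstash reads the file. The following is only a minimal sketch: the function name rename_type_field and the replacement key log_type are made up for illustration, and it assumes the key is always written exactly as "type": with no spaces. It is a plain text substitution, so a "type" key inside a nested object would be renamed as well.

```shell
#!/bin/sh
# rename_type_field: read JSON log lines on stdin and rename every
# "type" key to "log_type" (a hypothetical name) so that it no longer
# collides with the document type used by the elasticsearch output.
# Assumption: the key is written exactly as "type": with no spaces.
rename_type_field() {
    sed 's/"type":/"log_type":/g'
}

# Usage (paths are illustrative, not from the original setup):
#   rename_type_field < raw/app.log >> /data1/elasticsearch/temp/app.log
```

Keys such as "userType" are left alone, because the pattern only matches when the opening quote comes directly before the lowercase word type.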

=> issue 2) : the entries are not indexed field by field; each line is saved whole as a single string field, "message".

Solution)
step 1) check the current type mapping

elk:/data1/elasticsearch> curl -XGET 'http://localhost:9200/teton_runtime/logs/_mapping'
{"teton_runtime":{"mappings":{"logs":{"properties":{"@timestamp":{"type":"date","format":"dateOptionalTime"},"@version":{"type":"string"},"androidId":{"type":"string"},"externalBalance":{"type":"long"},"facebookId":{"type":"string"},"host":{"type":"string"},"id":{"type":"long"},"message":{"type":"string"},"numberOfMyFollowers":{"type":"long"},"path":{"type":"string"},"profileImagePath":{"type":"string","store":true},"uniqueId":{"type":"string"},"userType":{"type":"string"}}}}}}

step 2) create a new type mapping based on the current one
curl -XPUT 'http://localhost:9200/teton_runtime/logs/_mapping' -d '
{
    "logs" : {
        "properties" : {
            "facebookId" : {"type" : "string"},
            "androidId" : {"type" : "string"},
            "numberOfMyFollowers" : {"type" : "long"},
            "externalBalance" : {"type" : "long"},
            "userType" : {"type" : "string"},
            "uniqueId" : {"type" : "string"},
            "id" : {"type" : "long"},
            "profileImagePath" : {"type" : "string", "store" : true},
            "tos" : {"type" : "boolean"},
            "activeUser" : {"type" : "string"},
            "Action" : {"type" : "string"},
            "$gender" : {"type" : "string"},
            "event" : {"type" : "string"},
            "age" : {"type" : "integer"},
            "name" : {"type" : "string"},
            "Campaign_id" : {"type" : "string"},
            "created_at" : {"type" : "date"},
            "gender" : {"type" : "string"},
            "udob" : {"type" : "date"},
            "active_timestamp" : {"type" : "string"},
            "longitude" : {"type" : "string"},
            "$age" : {"type" : "date"},
            "os" : {"type" : "string"},
            "user_group" : {"type" : "string"},
            "status" : {"type" : "string"},
            "zipcode" : {"type" : "string"},
            "cash_amount" : {"type" : "long"},
            "manu" : {"type" : "string"},
            "email" : {"type" : "string"},
            "appVersion" : {"type" : "string"},
            "dob" : {"type" : "date"},
            "latitude" : {"type" : "string"}
        }
    }
}
'

step 3) delete the current index
curl -XDELETE 'http://localhost:9200/teton_runtime'

step 4) append the log lines to a file in /data1/elasticsearch/temp/*.log, ending the file with an empty last line (that is, a trailing newline)
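
Because the file input above uses start_position => "end", Logstash only picks up data appended after it has started watching the file, and it treats a line as complete once it sees the terminating newline. A minimal sketch of appending one event that way follows; the helper name append_log_line and the file name in the usage comment are hypothetical.

```shell
#!/bin/sh
# append_log_line FILE JSON: append one JSON document to FILE as a
# single newline-terminated line. printf '%s\n' always writes the
# trailing newline, so the last line is never left unterminated.
append_log_line() {
    printf '%s\n' "$2" >> "$1"
}

# Usage (file name is illustrative):
#   append_log_line /data1/elasticsearch/temp/app.log \
#       '{"event":"Impression_test","age":36}'
```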

ex)
curl -XGET 'http://localhost:9200/impression/logs/_mapping'

curl -XDELETE 'http://localhost:9200/impression/logs/_mapping'

curl -XPUT 'http://localhost:9200/impression/logs/_mapping' -d '
{
    "logs" : {
        "properties" : {
            "carrier" : {"type" : "string", "index" : "not_analyzed"}
        }
    }
}
'
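
When several fields need the same not_analyzed treatment as "carrier", the mapping body can be generated instead of typed by hand. This is only a sketch: the function name not_analyzed_mapping is hypothetical, and it assumes the type and field names contain no characters that need JSON escaping.

```shell
#!/bin/sh
# not_analyzed_mapping TYPE FIELD...: print a mapping body that declares
# each FIELD as a not_analyzed string under the given document TYPE.
# Assumption: names need no JSON escaping (no quotes or backslashes).
not_analyzed_mapping() {
    doc_type=$1
    shift
    props=''
    for field in "$@"; do
        # Join the per-field definitions with commas.
        props="${props:+$props,}\"$field\":{\"type\":\"string\",\"index\":\"not_analyzed\"}"
    done
    printf '{"%s":{"properties":{%s}}}\n' "$doc_type" "$props"
}

# Usage (not run here; "device" is an illustrative second field):
#   curl -XPUT 'http://localhost:9200/impression/logs/_mapping' \
#        -d "$(not_analyzed_mapping logs carrier device)"
```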

* http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/

