I have typed this command to index a document in Elasticsearch
create an index
curl -X PUT \"localhost:9200/test_idx_1x\"
http://es-cn.medcl.net/tutorials/2011/07/18/attachment-type-in-action.html
#!/bin/sh
coded=`cat fn6742.pdf | perl -MMIME::Base64 -ne 'print encode_base64($_)'`
json="{\"file\":\"${coded}\"}"
echo "$json" > json.file
curl -X POST "localhost:9200/test/attachment/" -d @json.file
There is an alternative solution - plugin at http://elasticwarehouse.org. You can upload binary file using _ewupload?, read newly generated ID and update your different index with this reference.
Install plugin:
plugin -install elasticwarehouseplugin -u http://elasticwarehouse.org/elasticwarehouse/elasticsearch-elasticwarehouseplugin-1.2.2-1.7.0-with-dependencies.zip
Restart cluster, then:
curl -XPOST "http://127.0.0.1:9200/_ewupload?folder=/myfolder&filename=mybinaryfile.bin" --data-binary @mybinaryfile.bin
Sample response:
{"id":"nWvrczBcSEywHRBBBwfy2g","version":1,"created":true}
First, you don't specify whether you have the attachment
plugin installed. If not, you can do so with:
./bin/plugin -install mapper-attachments
You will need to restart ElasticSearch for it to load the plugin.
Then, as you do above, you map a field to have type attachment
:
curl -XPUT 'http://127.0.0.1:9200/foo/?pretty=1' -d '
{
"mappings" : {
"doc" : {
"properties" : {
"file" : {
"type" : "attachment"
}
}
}
}
}
'
When you try to index a document, you need to encode the contents of your file in Base64. You could do this on the command line using the base64
command line utility. However, to be legal JSON, you also need to encode new lines, which you can do by piping the output from base64
through Perl:
curl -XPOST 'http://127.0.0.1:9200/foo/doc?pretty=1' -d '
{
"file" : '`base64 /path/to/file | perl -pe 's/\n/\\n/g'`'
}
'
Now you can search your file:
curl -XGET 'http://127.0.0.1:9200/foo/doc/_search?pretty=1' -d '
{
"query" : {
"text" : {
"file" : "text to look for"
}
}
}
'
See ElasticSearch attachment type for more.
This is a complete shell script implementation:
file_path='/path/to/file'
file=$(base64 $file_path | perl -pe 's/\n/\\n/g')
curl -XPUT "http://eshost.com:9200/index/type/" -d '{
"file" : "content" : "'$file'"
}'