Cheeky Solr Grabber

Published: August 3rd 2018

I had one of those moments where I needed to pull data out of Solr and realised that the nature of the query meant it was going to time out unless I ran it in batches. As I'm busy at the moment with many spinning plates (and rather averse to manual crap), I thought I'd write a quick and dirty piece of code to do things for me in the background.

Yet more Bash Scripts!

#!/bin/bash

#  CHEEKY SOLR BATCHENATOR
#  Grabs info from a query, batches it up then concats to an output file

# Which Solr Server
echo -e "Solr Server IP?"
read solrServer

# Which Keyspace
echo -e "Keyspace?"
read keyspace

# Which Shard
echo -e "Which Shard?"
read shard

# Which Shard Replica
echo -e "Which Replica?"
read replica

# q Query
echo -e "What is the value of q?"
read qQuery
# fl Query
echo -e "What is the value of fl?"
read flQuery

# How many rows do you want to return
echo -e "How many items to loop over?"
read max
# Where do you want to begin from - 0 or where you left off, e.g. 1000
echo -e "Beginning from what row?"
read begin
# How many rows do you want to return in each iterative run? ( Batch size )
echo -e "How many rows per cycle?"
read rowsPerCycle
# Calculate how many batches there are, rounding up so a partial final batch is counted
let "batchsize=(max-begin+rowsPerCycle-1)/rowsPerCycle"
batchNum=1
# Undertake the query on the cluster
while [ "$begin" -lt "$max" ]
do
  echo -e "Batch $batchNum of $batchsize"
  # Run the query against Solr (the braces stop bash reading "$keyspace_" as one variable name)
  curl -sS "http://$solrServer:8080/solr/${keyspace}_${shard}_${replica}/select?q=$qQuery&fl=$flQuery&start=$begin&rows=$rowsPerCycle&wt=csv&indent=true" >>solr-output-curl.csv
  # Increment the counters; start is a zero-based offset, so step by exactly one batch
  let "begin=begin+rowsPerCycle"
  let "batchNum++"
done
# Exit successfully
exit 0
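
The paging arithmetic is easy to get wrong by one, so it can be worth checking without touching Solr at all. The following is a minimal sketch, assuming `start` is a zero-based offset and that `max` is the offset to stop before; `batch_offsets` and `batch_count` are hypothetical helper names, not part of the script above.

```shell
#!/bin/bash

# Hypothetical helper: print the start offset of every batch, one per line.
# Arguments: first offset, offset to stop before, rows per batch.
batch_offsets() {
  local begin=$1 max=$2 rows=$3
  while [ "$begin" -lt "$max" ]; do
    echo "$begin"
    begin=$((begin + rows))
  done
}

# Hypothetical helper: number of batches, rounded up so that a
# partial final batch is still counted.
batch_count() {
  local begin=$1 max=$2 rows=$3
  echo $(( (max - begin + rows - 1) / rows ))
}
```

For example, `batch_offsets 0 250 100` prints 0, 100 and 200, and `batch_count 0 250 100` prints 3, so no row is skipped between batches and the last partial batch is included.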


It's not mahoosively efficient in operation, but it did what was necessary to return the data I needed. Of course, the query needs to be URL encoded; I pre-encoded mine in a Solr query interface before just shoving it directly into the query section, but I've tried to break it out here in case it's of use.
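
If you'd rather not pre-encode by hand, the encoding can be done in bash itself. This is a rough sketch rather than what I actually ran: `urlencode` is a hypothetical helper name, and it assumes the query is plain ASCII (anything multi-byte would need more care).

```shell
#!/bin/bash

# Hypothetical helper: percent-encode a string for use in a URL query.
# Assumes ASCII input; unreserved characters pass through untouched.
urlencode() {
  local LC_ALL=C                 # compare characters byte-wise
  local input="$1" out="" ch hex i
  for (( i = 0; i < ${#input}; i++ )); do
    ch="${input:i:1}"
    case "$ch" in
      [a-zA-Z0-9.~_-])
        out+="$ch"
        ;;
      *)
        # "'$ch" makes printf use the character's numeric code
        printf -v hex '%%%02X' "'$ch"
        out+="$hex"
        ;;
    esac
  done
  printf '%s\n' "$out"
}
```

So something like `qQuery=$(urlencode "$qQuery")` before the loop would do the job. curl can also handle this for you with `-G` and `--data-urlencode "q=..."`, which saves the function entirely.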

Happy hacking!
gingerCoder()
