Solr
Presented by Jacob Pitassi 07.28.12
Monday, July 30, 12
Hello
Jacob Pitassi, Director of Engineering
jpitassi@stauffer.com
stauffer.com
Parts of this presentation
setup
Install
Setup using Ubuntu:
    apt-get install tomcat6 tomcat6-admin
(Will install what you need.)
Download Solr 3.5:
    wget http://archive.apache.org/dist/lucene/solr/3.5.0/apache-solr-3.5.0.tgz
(Solr 1.4 for Drupal 6 is slow. Massive performance increase with 3.5, plus geospatial search.)
    tar -zxvf apache-solr-3.5.0.tgz
Copy solr.war to /var/lib/tomcat6/webapps
Copy the solr folder to /var/lib/tomcat6
Install
Create/edit /var/lib/tomcat6/conf/Catalina/localhost/solr.xml
<?xml version="1.0" ?>
<Context allowLinking="true" crossContext="true" debug="0" docBase="/var/lib/tomcat6/webapps/solr.war" privileged="true">
  <Environment name="solr/home" override="true" type="java.lang.String" value="/var/lib/tomcat6/solr"></Environment>
</Context>
The default Solr config did not work; I needed to go to solrconfig.xml:
<queryResponseWriter name="velocity" class="solr.VelocityResponseWriter" enable="${solr.velocity.enabled:true}"/>
Plenty of tutorials: Google "Drupal Apache Solr"
TADA!
Multicore
/var/lib/tomcat6/solr/solr.xml
<cores adminPath="/admin/cores" defaultCoreName="collection1">
  <core name="collection1" instanceDir="." />
  <core name="test" instanceDir="/var/lib/tomcat6/solr/cores/test/"/>
  <core name="demo" instanceDir="/var/lib/tomcat6/solr/cores/demo/"/>
</cores>
This is important when the server is a development box or the site has multiple search forms.
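With multicore, each core gets its own URL under the Solr webapp. A minimal Python sketch of that mapping, assuming the Tomcat setup from this deck (localhost:8080/solr); the helper names core_names and select_url are made up for illustration and are not part of Solr or Drupal:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# The <cores> block from the slide above, inlined so the sketch is self-contained.
SOLR_XML = """<cores adminPath="/admin/cores" defaultCoreName="collection1">
  <core name="collection1" instanceDir="." />
  <core name="test" instanceDir="/var/lib/tomcat6/solr/cores/test/"/>
  <core name="demo" instanceDir="/var/lib/tomcat6/solr/cores/demo/"/>
</cores>"""


def core_names(solr_xml):
    """Return the core names declared in a solr.xml string."""
    root = ET.fromstring(solr_xml)
    # iter() also finds <core> elements when <cores> is nested in a wrapper element.
    return [core.get("name") for core in root.iter("core")]


def select_url(core, query, base="http://localhost:8080/solr"):
    """Build the /select URL for one core (base URL is an assumption)."""
    return f"{base}/{core}/select?" + urlencode({"q": query, "wt": "json"})


if __name__ == "__main__":
    for name in core_names(SOLR_XML):
        print(select_url(name, "*:*"))
```

Pointing each Drupal site or search form at its own core URL keeps the indexes separate on one Solr instance.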
Performance Tuning
HARDWARE HARDWARE HARDWARE
Giving Solr its own server would increase performance, because it can be a pig and consume all resources.
On Rackspace, using internal IP addresses lets your Solr instance and web server communicate within the same farm with little latency.
Going from 8 GB to 16 GB of RAM and from an AMD dual core to an Intel quad core can jump from >25 queries a minute to ~100 queries a minute.
Performance Tuning
Tuning the Solr cache in solrconfig.xml
Stats page: localhost:8080/solr/admin/stats.jsp
Config file: /var/lib/tomcat6/solr/conf/solrconfig.xml
<!-- Query Result Cache
     Caches results of searches - ordered lists of document ids
     (DocList) based on a query, a sort, and the range of documents requested.
  -->
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
<!-- Document Cache
     Caches Lucene Document objects (the stored fields for each document).
     Since Lucene internal document ids are transient, this cache will not be autowarmed.
  -->
<documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
More examples located here: http://h3x.no/2011/05/10/guide-solr-performance-tuning
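One way to read the cache numbers on the stats page before touching size: a rough Python sketch, assuming you copy the lookups, hits and evictions counters for a cache off stats.jsp by hand. The cache_advice helper and the 0.75 threshold are hypothetical illustrations, not Solr defaults or recommendations:

```python
def cache_advice(lookups, hits, evictions, min_hit_ratio=0.75):
    """Suggest whether a Solr cache looks undersized.

    lookups, hits, evictions: counters read off the stats page.
    min_hit_ratio: hypothetical threshold, tune it for your own site.
    """
    if lookups == 0:
        return "no traffic yet"
    ratio = hits / lookups
    # A low hit ratio combined with evictions hints that entries are being
    # pushed out before they can be reused, so a larger size may help.
    if evictions > 0 and ratio < min_hit_ratio:
        return f"hit ratio {ratio:.2f} with evictions: consider raising size"
    return f"hit ratio {ratio:.2f}: leave as is"


if __name__ == "__main__":
    # Made-up sample numbers, for illustration only.
    print(cache_advice(lookups=1000, hits=400, evictions=250))
    print(cache_advice(lookups=1000, hits=900, evictions=0))
```

Change one cache at a time and recheck the stats page after some real traffic, since a bigger cache also uses more RAM.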
Security
Set up an admin username and password, and make the password strong (still not all that secure).
The Ubuntu firewall UFW is pretty easy to configure and does a pretty decent job.
"Solr has no known cross-site scripting vulnerabilities."
Cross-site request forgery can be an issue, so DON'T CLICK SKETCHY LINKS on machines that have access to your Solr site.
http://wiki.apache.org/solr/SolrSecurity/
Drupal 7
Search API or Apache Solr Integration
Search comes with Views support and facets out of the box.
A lot of support for Drupal 7. If you are picking up Drupal 7 and Solr, take a look.
Coming from Drupal 6: Apache Solr Integration
Very recently changed hooks
Currently 7.x-1.0-rc2
Apache Solr Integration
Download the module: http://drupal.org/project/apachesolr
Inside the module, copy the conf files to your core's conf folder (3 vs. 1.4).
Enable the module (duh, I know).
Go to settings and add your core (multicore built in).
Also configure your Drupal search to use Solr: /admin/config/search/settings
Index and cron
http://drupal.org/node/1571914
hook_apachesolr_field_mappings()
hook_apachesolr_field_mappings_alter(&$mappings, $entity_type)
hook_apachesolr_query_prepare($query)
hook_apachesolr_field_name_map_alter(&$map)
hook_apachesolr_query_alter($query)
hook_apachesolr_delete_index_alter($query)
hook_apachesolr_entity_info_alter(&$entity_info)
hook_apachesolr_search_result_alter($document, &$extra, DrupalSolrQueryInterface $query)
hook_apachesolr_process_results(&$results, DrupalSolrQueryInterface $query)
hook_apachesolr_environment_delete($environment)
hook_apachesolr_search_page_alter(&$build, $search_page)
hook_apachesolr_search_types_alter(&$search_types)
hook_apachesolr_index_document_build(ApacheSolrDocument $document, $entity, $entity_type, $env_id)
hook_apachesolr_index_document_build_ENTITY_TYPE(ApacheSolrDocument $document, $entity, $env_id)
hook_apachesolr_index_documents_alter(array &$documents, $entity, $entity_type,
Schema.xml
schema.xml field types
<!-- For 2 and 3 letter prefix dynamic fields, the 1st letter indicates the data type and
     the last letter is 's' for single valued, 'm' for multi-valued -->
<!-- We use long for integer since 64 bit ints are now common in PHP. -->
<dynamicField name="is_*" type="long" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="im_*" type="long" indexed="true" stored="true" multiValued="true"/>
<!-- List of booleans can be saved in a regular boolean field -->
<dynamicField name="bm_*" type="boolean" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="bs_*" type="boolean" indexed="true" stored="true" multiValued="false"/>
<!-- Regular text (without processing) can be stored in a string field-->
<dynamicField name="ss_*" type="string" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="sm_*" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- Normal text fields are for full text - the relevance of a match depends on the length of the text -->
<dynamicField name="ts_*" type="text" indexed="true" stored="true" multiValued="false" termVectors="true"/>
<dynamicField name="tm_*" type="text" indexed="true" stored="true" multiValued="true" termVectors="true"/>
...
<!-- External file fields -->
<dynamicField name="eff_*" type="file"/>
Plus many more; check them out.
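The prefix convention in those comments (first letter is the data type, last letter is 's' or 'm') can be sketched in a few lines of Python. This covers only the four types shown on the slide; the dynamic_field_name helper and the TYPE_LETTER table are illustrative names, not something the apachesolr module exposes:

```python
# First letter of the prefix, per the dynamicField entries on the slide.
TYPE_LETTER = {
    "integer": "i",  # stored as Solr "long"
    "boolean": "b",
    "string": "s",
    "text": "t",
}


def dynamic_field_name(data_type, multi_valued, name):
    """Build a Solr dynamic field name such as ss_title or im_tags."""
    letter = TYPE_LETTER[data_type]
    # Last letter: 'm' for multi-valued, 's' for single valued.
    cardinality = "m" if multi_valued else "s"
    return f"{letter}{cardinality}_{name}"


if __name__ == "__main__":
    print(dynamic_field_name("string", False, "title"))  # ss_title
    print(dynamic_field_name("integer", True, "tags"))   # im_tags
```

Knowing the prefix is handy when you write queries or facets by hand, because the field name in the index is the prefixed one.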
Drupal 7
Important modules to consider:
Apache Solr Autocomplete
Apache Solr Views
Profile2 Apache Solr (for user information searching)
Apache Solr Attachments (just a cool module)
API Docs
http://api.drupalhelp.net/api/apachesolr/classes/7
http://api.drupalhelp.net/api/apachesolr/Solr_Base_Query.php/class/SolrBaseQuery/7
Search API
wget http://solr-php-client.googlecode.com/files/SolrPhpClient.r60.2011-05-04.tgz
Get modules. If you use drush you will have:
    search_api
    search_api_facetapi
    search_api_views
You will also need:
    search_api_solr
Enable modules (yeah, OK).
Copy the Search API Solr schema.xml and solrconfig.xml to your core's conf folder.
In the admin search settings, create a search and an index (let me show you).
Unlike Apache Solr Integration, you do not really have to play with schema.xml, but go take a look if curious.
GeoSpatial
http://ericlondon.com/posts/240-geospatial-apache-solr-searching-in-drupal-7-using-the-search-api-module-ubuntu-version