 
              Solr Presented by Jacob Pitassi 07.28.12 Monday, July 30, 12
Hello
Jacob Pitassi
Director of Engineering
jpitassi@stauffer.com
stauffer.com
What are you talking about
Parts of this presentation
• Install Apache Solr with a multicore setup
• Solr performance hints
• Solr security
• Solr schema.xml
• Apache Solr Integration module
• Search API
Install
Install
Setup using Ubuntu:
    apt-get install tomcat6 tomcat6-admin
(This will install what you need.)
Download Solr 3.5:
    wget http://archive.apache.org/dist/lucene/solr/3.5.0/apache-solr-3.5.0.tgz
(Solr 1.4 for Drupal 6 is slow. 3.5 brings a massive performance increase, plus geospatial search.)
    tar -zxvf apache-solr-3.5.0.tgz
Copy solr.war to /var/lib/tomcat6/webapps
Copy the solr folder to /var/lib/tomcat6
Install
Create/edit /var/lib/tomcat6/conf/Catalina/localhost/solr.xml
    <?xml version="1.0" ?>
    <Context allowLinking="true" crossContext="true" debug="0" docBase="/var/lib/tomcat6/webapps/solr.war" privileged="true">
      <Environment name="solr/home" override="true" type="java.lang.String" value="/var/lib/tomcat6/solr"></Environment>
    </Context>
The default Solr did not work; I needed to go to this line in solrconfig.xml:
    <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter" enable="${solr.velocity.enabled:true}"/>
Plenty of tutorials: Google "Drupal Apache Solr"
TADA!
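A quick way to sanity-check the context descriptor before restarting Tomcat is to parse it and read back the two paths it wires together. This is only a sketch: the function name solr_locations is hypothetical, the XML is inlined rather than read from disk, and it assumes Tomcat's camelCase attribute spelling (docBase).

```python
import xml.etree.ElementTree as ET

# Inline copy of the context descriptor from the slide (camelCase attributes assumed).
CONTEXT_XML = """<?xml version="1.0" ?>
<Context allowLinking="true" crossContext="true" debug="0"
         docBase="/var/lib/tomcat6/webapps/solr.war" privileged="true">
  <Environment name="solr/home" override="true" type="java.lang.String"
               value="/var/lib/tomcat6/solr"></Environment>
</Context>
"""


def solr_locations(context_xml):
    """Return the war path and the solr/home path declared in a Context file.

    Hypothetical helper for illustration; raises ParseError on malformed XML.
    """
    root = ET.fromstring(context_xml)
    home = None
    for env in root.iter("Environment"):
        if env.get("name") == "solr/home":
            home = env.get("value")
    return {"war": root.get("docBase"), "home": home}


if __name__ == "__main__":
    print(solr_locations(CONTEXT_XML))
```

If "war" or "home" comes back as None, the attribute or the Environment entry is misspelled or missing.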
Multicore
/var/lib/tomcat6/solr/solr.xml
    <cores adminPath="/admin/cores" defaultCoreName="collection1">
      <core name="collection1" instanceDir="." />
      <core name="test" instanceDir="/var/lib/tomcat6/solr/cores/test/"/>
      <core name="demo" instanceDir="/var/lib/tomcat6/solr/cores/demo/"/>
    </cores>
This is important when the server is a development box or has to back multiple search forms.
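With multicore, each core gets its own URL path under the Solr webapp. A minimal sketch of that mapping, assuming Tomcat listens on localhost:8080 with the webapp at /solr, and that the cores block sits inside a <solr> root element; the helper names core_names and select_url are hypothetical:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# The <cores> block from the slide, wrapped in the <solr> root element (assumed).
SOLR_XML = """
<solr>
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    <core name="collection1" instanceDir="." />
    <core name="test" instanceDir="/var/lib/tomcat6/solr/cores/test/" />
    <core name="demo" instanceDir="/var/lib/tomcat6/solr/cores/demo/" />
  </cores>
</solr>
"""


def core_names(solr_xml):
    """List the core names declared in a solr.xml document, in file order."""
    root = ET.fromstring(solr_xml)
    return [core.get("name") for core in root.iter("core")]


def select_url(core, query, base="http://localhost:8080/solr"):
    """Build the search URL for one core; the base URL is an assumption."""
    return f"{base}/{core}/select?{urlencode({'q': query, 'wt': 'json'})}"


if __name__ == "__main__":
    for name in core_names(SOLR_XML):
        print(select_url(name, "drupal"))
```

Each Drupal site or search form can then point at its own core URL without sharing an index.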
Performance
Performance Tuning
HARDWARE HARDWARE HARDWARE
• Giving Solr its own server would increase performance, because it can be a pig and consume all resources
• On Rackspace, using internal IP addresses lets your Solr instance and web server communicate within the same farm with little latency
• Going from 8 GB to 16 GB of RAM and from an AMD dual core to an Intel quad core can jump from >25 queries a minute to ~100 queries a minute
Performance Tuning
Tuning the Solr cache in solrconfig.xml
localhost:8080/solr/admin/stats.jsp
/var/lib/tomcat6/solr/conf/solrconfig.xml
    <!-- Query Result Cache
         Caches results of searches - ordered lists of document ids
         (DocList) based on a query, a sort, and the range of documents requested. -->
    <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
    <!-- Document Cache
         Caches Lucene Document objects (the stored fields for each document).
         Since Lucene internal document ids are transient, this cache will not be autowarmed. -->
    <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
More examples located here: http://h3x.no/2011/05/10/guide-solr-performance-tuning
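When tuning, it helps to record the cache settings before and after each change. A small sketch that reads them out of a solrconfig.xml fragment; the helper name cache_settings is hypothetical, the fragment is inlined, and it assumes the caches are direct children of the <query> element with plain integer attributes as on the slide:

```python
import xml.etree.ElementTree as ET

# The two cache entries from the slide, wrapped in their <query> parent (assumed).
FRAGMENT = """
<query>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
</query>
"""


def cache_settings(xml_text):
    """Map each *Cache element name to its size-related attributes as integers."""
    settings = {}
    for element in ET.fromstring(xml_text):
        if element.tag.endswith("Cache"):
            settings[element.tag] = {
                "size": int(element.get("size")),
                "initialSize": int(element.get("initialSize")),
                "autowarmCount": int(element.get("autowarmCount")),
            }
    return settings


if __name__ == "__main__":
    for name, values in cache_settings(FRAGMENT).items():
        print(name, values)
```

Compare these numbers with what stats.jsp reports for the same caches to decide whether a size change is worth trying.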
Security
Security
• Set up an admin username and password, and make the password strong (still not all that secure)
• The Ubuntu firewall, UFW, is pretty easy to configure and does a pretty decent job
• "Solr has no known cross-site scripting vulnerabilities."
• Cross-site request forgery can be an issue, so DON'T CLICK SKETCHY LINKS on machines that have access to your Solr site
http://wiki.apache.org/solr/SolrSecurity/
Schema
Apache Solr Integration
Drupal 7
Search API or Apache Solr Integration
• Search comes with Views support and facets out of the box
• A lot of support for Drupal 7
• If you are picking up Drupal 7 and Solr, take a look
Coming from Drupal 6: Apache Solr Integration
• Very recently changed hooks
• Currently 7.x-1.0-rc2
Apache Solr Integration
• Download the module: http://drupal.org/project/apachesolr
• Inside the module, copy the conf files to your core's conf folder (3 v 1.4)
• Enable the module (duh, I know)
• Go to settings and add your core (multicore built in)
• Also configure your Drupal search to use Solr: /admin/config/search/settings
• Index and cron
http://drupal.org/node/1571914
hook_apachesolr_field_mappings()
hook_apachesolr_field_mappings_alter(&$mappings, $entity_type)
hook_apachesolr_query_prepare($query)
hook_apachesolr_field_name_map_alter(&$map)
hook_apachesolr_query_alter($query)
hook_apachesolr_delete_index_alter($query)
hook_apachesolr_entity_info_alter(&$entity_info)
hook_apachesolr_search_result_alter($document, &$extra, DrupalSolrQueryInterface $query)
hook_apachesolr_process_results(&$results, DrupalSolrQueryInterface $query)
hook_apachesolr_environment_delete($environment)
hook_apachesolr_search_page_alter(&$build, $search_page)
hook_apachesolr_search_types_alter(&$search_types)
hook_apachesolr_index_document_build(ApacheSolrDocument $document, $entity, $entity_type, $env_id)
hook_apachesolr_index_document_build_ENTITY_TYPE(ApacheSolrDocument $document, $entity, $env_id)
hook_apachesolr_index_documents_alter(array &$documents, $entity, $entity_type,
Schema.xml
schema.xml field types
    <!-- For 2 and 3 letter prefix dynamic fields, the 1st letter indicates the data type and
         the last letter is 's' for single valued, 'm' for multi-valued -->
    <!-- We use long for integer since 64 bit ints are now common in PHP. -->
    <dynamicField name="is_*" type="long" indexed="true" stored="true" multiValued="false"/>
    <dynamicField name="im_*" type="long" indexed="true" stored="true" multiValued="true"/>
    <!-- List of booleans can be saved in a regular boolean field -->
    <dynamicField name="bm_*" type="boolean" indexed="true" stored="true" multiValued="true"/>
    <dynamicField name="bs_*" type="boolean" indexed="true" stored="true" multiValued="false"/>
    <!-- Regular text (without processing) can be stored in a string field -->
    <dynamicField name="ss_*" type="string" indexed="true" stored="true" multiValued="false"/>
    <dynamicField name="sm_*" type="string" indexed="true" stored="true" multiValued="true"/>
    <!-- Normal text fields are for full text - the relevance of a match depends on the length of the text -->
    <dynamicField name="ts_*" type="text" indexed="true" stored="true" multiValued="false" termVectors="true"/>
    <dynamicField name="tm_*" type="text" indexed="true" stored="true" multiValued="true" termVectors="true"/>
    ...
    <!-- External file fields -->
    <dynamicField name="eff_*" type="file"/>
Plus many more; check them out.
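The prefix convention in those comments (first letter for the data type, last letter 's' or 'm' for single or multi-valued) can be sketched as a small lookup. This covers only the four types shown on the slide; the function name dynamic_field_name and the example field names are hypothetical.

```python
# First letter of the prefix for each data type shown on the slide.
TYPE_LETTERS = {"long": "i", "boolean": "b", "string": "s", "text": "t"}


def dynamic_field_name(data_type, multi_valued, name):
    """Build a Solr dynamic field name such as 'is_nid' or 'sm_tags'.

    Illustrative helper only: it knows the four types listed in TYPE_LETTERS.
    """
    if data_type not in TYPE_LETTERS:
        raise ValueError(f"no prefix known for type {data_type!r}")
    cardinality = "m" if multi_valued else "s"
    return f"{TYPE_LETTERS[data_type]}{cardinality}_{name}"


if __name__ == "__main__":
    print(dynamic_field_name("long", False, "nid"))     # single-valued integer
    print(dynamic_field_name("string", True, "tags"))   # multi-valued string
```

Because the fields are dynamic, any name that matches one of these prefixes is accepted by Solr without adding a new entry to schema.xml.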
Drupal 7
Important modules to consider
• Apache Solr Autocomplete
• Apache Solr Views
• Profile2 Apache Solr (for user information searching)
• Apache Solr Attachments (just a cool module)
API Docs
http://api.drupalhelp.net/api/apachesolr/classes/7
http://api.drupalhelp.net/api/apachesolr/Solr_Base_Query.php/class/SolrBaseQuery/7
Search API
Search API
    wget http://solr-php-client.googlecode.com/files/SolrPhpClient.r60.2011-05-04.tgz
Get modules. If you use drush you will have:
• search_api
• search_api_facetapi
• search_api_views
You will also need:
• search_api_solr
Enable the modules (yeah, OK)
Copy the Search API Solr schema.xml and solrconfig.xml to your core's conf folder
In the admin search settings, create a search and an index (let me show you)
Unlike Apache Solr Integration, you do not really have to play with schema.xml, but go take a look if you are curious
GeoSpatial
http://ericlondon.com/posts/240-geospatial-apache-solr-searching-in-drupal-7-using-the-search-api-module-ubuntu-version
Questions?
Thank You!