Solr Presented by Jacob Pitassi 07.28.12 Monday, July 30, 12 - - PowerPoint PPT Presentation



SLIDE 1

Solr

Presented by Jacob Pitassi 07.28.12

Monday, July 30, 12
SLIDE 2

Hello

Jacob Pitassi Director of Engineering jpitassi@stauffer.com stauffer.com

SLIDE 3

What are you talking about

SLIDE 4

Parts of this presentation

  • Install Apache Solr with a multicore setup

  • Solr Performance hints
  • Solr Security
  • Solr schema.xml
  • Apache Solr Integration Module
  • Search API
SLIDE 5

Install

SLIDE 6

Install

Setup using Ubuntu

  • apt-get install tomcat6 tomcat6-admin (will install what you need)
  • Download Solr 3.5: wget http://archive.apache.org/dist/lucene/solr/3.5.0/apache-solr-3.5.0.tgz
  • (Solr 1.4 for Drupal 6 is slow. Massive performance increase with 3.5, also geospatial search.)
  • tar -zxvf apache-solr-3.5.0.tgz
  • Copy solr.war to /var/lib/tomcat6/webapps
  • Copy the solr folder to /var/lib/tomcat6

SLIDE 7

Install

Create/edit /var/lib/tomcat6/conf/Catalina/localhost/solr.xml:

<?xml version="1.0" ?>
<Context allowLinking="true" crossContext="true" debug="0" docBase="/var/lib/tomcat6/webapps/solr.war" privileged="true">
  <Environment name="solr/home" override="true" type="java.lang.String" value="/var/lib/tomcat6/solr"></Environment>
</Context>
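A quick way to sanity-check the context file before restarting Tomcat is to parse it and read back the solr/home entry. The snippet below is a minimal sketch, not part of the original deck: the helper name read_solr_home is made up for illustration, and the XML is inlined rather than read from disk. Note that Tomcat's attribute names are case-sensitive (docBase, crossContext, privileged, allowLinking).

```python
import xml.etree.ElementTree as ET

# Context descriptor from the slide, inlined for the example.
CONTEXT_XML = """<?xml version="1.0" ?>
<Context allowLinking="true" crossContext="true" debug="0"
         docBase="/var/lib/tomcat6/webapps/solr.war" privileged="true">
  <Environment name="solr/home" override="true"
               type="java.lang.String" value="/var/lib/tomcat6/solr"/>
</Context>
"""


def read_solr_home(context_xml: str) -> str:
    """Return the value of the solr/home Environment entry (hypothetical helper)."""
    root = ET.fromstring(context_xml)  # raises ParseError if the XML is malformed
    for env in root.findall("Environment"):
        if env.get("name") == "solr/home":
            return env.get("value", "")
    raise ValueError("no solr/home Environment entry found")


if __name__ == "__main__":
    print(read_solr_home(CONTEXT_XML))
```

If the parse fails or the entry is missing, the file is worth fixing before Tomcat tries to load it.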

The default Solr did not work; I needed to go into solrconfig.xml, at this line:

<queryResponseWriter name="velocity" class="solr.VelocityResponseWriter" enable="${solr.velocity.enabled:true}"/>

Plenty of tutorials: Google "Drupal Apache Solr". TADA!

SLIDE 8

Multicore

/var/lib/tomcat6/solr/solr.xml

<cores adminPath="/admin/cores" defaultCoreName="collection1">
  <core name="collection1" instanceDir="." />
  <core name="test" instanceDir="/var/lib/tomcat6/solr/cores/test/"/>
  <core name="demo" instanceDir="/var/lib/tomcat6/solr/cores/demo/"/>
</cores>

This is important when the box is a development server or the site has multiple search forms.
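When a development box hosts many cores, the core list can be generated instead of hand-edited. This is only a sketch under assumptions: the function name build_solr_xml is invented here, and the enclosing <solr> root element is added because the slide shows only the <cores> excerpt.

```python
import xml.etree.ElementTree as ET


def build_solr_xml(cores: dict, default_core: str) -> str:
    """Build a multicore solr.xml string from a {core name: instanceDir} mapping."""
    if default_core not in cores:
        raise ValueError(f"default core {default_core!r} is not in the core list")
    solr = ET.Element("solr")  # assumed root element wrapping <cores>
    cores_el = ET.SubElement(
        solr, "cores", adminPath="/admin/cores", defaultCoreName=default_core
    )
    for name, instance_dir in cores.items():
        ET.SubElement(cores_el, "core", name=name, instanceDir=instance_dir)
    return ET.tostring(solr, encoding="unicode")


if __name__ == "__main__":
    # Core names and paths are the ones shown on the slide.
    print(build_solr_xml(
        {
            "collection1": ".",
            "test": "/var/lib/tomcat6/solr/cores/test/",
            "demo": "/var/lib/tomcat6/solr/cores/demo/",
        },
        default_core="collection1",
    ))
```

Each instanceDir still needs its own conf folder with schema.xml and solrconfig.xml.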

SLIDE 9

Performance

SLIDE 10

Performance Tuning

HARDWARE HARDWARE HARDWARE

  • Giving Solr its own server would increase performance, because it can be a pig and consume all resources.
  • On Rackspace, using internal IP addresses lets your Solr instance and web server communicate within the same farm with little latency.
  • Going from 8 GB to 16 GB of RAM and from an AMD dual core to an Intel quad core can jump from >25 queries a minute to ~100 queries a minute.

SLIDE 11

Performance Tuning

Tuning the Solr cache in solrconfig.xml

Stats page: localhost:8080/solr/admin/stats.jsp
Config file: /var/lib/tomcat6/solr/conf/solrconfig.xml

<!-- Query Result Cache
     Caches results of searches - ordered lists of document ids (DocList) based on a query, a sort, and the range of documents requested.
-->
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>

<!-- Document Cache
     Caches Lucene Document objects (the stored fields for each document). Since Lucene internal document ids are transient, this cache will not be autowarmed.
-->
<documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>

More examples located here: http://h3x.no/2011/05/10/guide-solr-performance-tuning
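The stats page reports lookups, hits, and evictions for each cache, and those numbers are what drive a decision to change size. The helper below is a rough sketch, not a rule from the deck: the 0.75 threshold and the function names are assumptions chosen for illustration, and the counter values are typed in by hand rather than fetched from Solr.

```python
def hit_ratio(lookups: int, hits: int) -> float:
    """Fraction of cache lookups that were served from the cache."""
    if lookups == 0:
        return 0.0
    return hits / lookups


def cache_advice(lookups: int, hits: int, evictions: int,
                 min_ratio: float = 0.75) -> str:
    """Return 'grow' when entries are being evicted and the hit ratio is low.

    The min_ratio default is an illustrative assumption, not a Solr setting.
    """
    if evictions > 0 and hit_ratio(lookups, hits) < min_ratio:
        return "grow"
    return "ok"


if __name__ == "__main__":
    # Hypothetical counters copied by hand from the stats page.
    print(cache_advice(lookups=1000, hits=400, evictions=300))
```

A cache that evicts heavily while missing most lookups is a candidate for a larger size; one with a high hit ratio and no evictions can be left alone.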

SLIDE 12

Security

SLIDE 13

Security

  • Set up an admin username and password, and make the password strong (still not all that secure).
  • The Ubuntu firewall, UFW, is pretty easy to configure and does a pretty decent job.
  • "Solr has no known cross-site scripting vulnerabilities."
  • Cross-site request forgery can be an issue, so DON'T CLICK SKETCHY LINKS on machines that have access to your Solr site.
  • http://wiki.apache.org/solr/SolrSecurity/
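Once a username and password are in place, any script that talks to Solr has to send them. The snippet below sketches that with HTTP Basic authentication using only the standard library. It assumes the container is configured for Basic auth; the function name, URL, and credentials are placeholders. The request is built but not sent.

```python
import base64
import urllib.request


def solr_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build a request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    # Placeholder URL and credentials, for illustration only.
    req = solr_request("http://localhost:8080/solr/admin/ping", "admin", "change-me")
    print(req.get_header("Authorization"))
```

Basic auth only base64-encodes the credentials, which is one reason the password alone is "still not all that secure" without a firewall or TLS in front.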

SLIDE 14

Schema

SLIDE 15

Apache Solr Integration

SLIDE 16

Drupal 7

Search API or Apache Solr Integration

  • Search API comes with Views support and facets out of the box. There is a lot of support for Drupal 7. If you are picking up Drupal 7 and Solr, take a look.
  • Coming from Drupal 6: Apache Solr Integration. Its hooks changed very recently. Currently at 7.x-1.0-rc2.

SLIDE 17

Apache Solr Integration

  • Download the module: http://drupal.org/project/apachesolr
  • Inside the module, copy the conf files to your core's conf folder (3 vs. 1.4).
  • Enable the module (duh, I know).
  • Go to settings and add your core (multicore is built in).
  • Also configure your Drupal search to use Solr: /admin/config/search/settings
  • Index and cron

SLIDE 18

http://drupal.org/node/1571914

hook_apachesolr_field_mappings()
hook_apachesolr_field_mappings_alter(&$mappings, $entity_type)
hook_apachesolr_query_prepare($query)
hook_apachesolr_field_name_map_alter(&$map)
hook_apachesolr_query_alter($query)
hook_apachesolr_delete_index_alter($query)
hook_apachesolr_entity_info_alter(&$entity_info)
hook_apachesolr_search_result_alter($document, &$extra, DrupalSolrQueryInterface $query)
hook_apachesolr_process_results(&$results, DrupalSolrQueryInterface $query)
hook_apachesolr_environment_delete($environment)
hook_apachesolr_search_page_alter(&$build, $search_page)
hook_apachesolr_search_types_alter(&$search_types)
hook_apachesolr_index_document_build(ApacheSolrDocument $document, $entity, $entity_type, $env_id)
hook_apachesolr_index_document_build_ENTITY_TYPE(ApacheSolrDocument $document, $entity, $env_id)
hook_apachesolr_index_documents_alter(array &$documents, $entity, $entity_type,

SLIDE 19

Schema.xml

schema.xml field types

<!-- For 2 and 3 letter prefix dynamic fields, the 1st letter indicates the data type and the last letter is 's' for single valued, 'm' for multi-valued -->

<!-- We use long for integer since 64 bit ints are now common in PHP. -->
<dynamicField name="is_*" type="long" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="im_*" type="long" indexed="true" stored="true" multiValued="true"/>

<!-- List of booleans can be saved in a regular boolean field -->
<dynamicField name="bm_*" type="boolean" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="bs_*" type="boolean" indexed="true" stored="true" multiValued="false"/>

<!-- Regular text (without processing) can be stored in a string field -->
<dynamicField name="ss_*" type="string" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="sm_*" type="string" indexed="true" stored="true" multiValued="true"/>

<!-- Normal text fields are for full text - the relevance of a match depends on the length of the text -->
<dynamicField name="ts_*" type="text" indexed="true" stored="true" multiValued="false" termVectors="true"/>
<dynamicField name="tm_*" type="text" indexed="true" stored="true" multiValued="true" termVectors="true"/>

...

<!-- External file fields -->
<dynamicField name="eff_*" type="file"/>

Plus many more; check them out.
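The prefix convention in the comments above (first letter for the data type, last letter 's' or 'm' for cardinality) can be expressed as a small lookup. This is an illustrative sketch covering only the four types shown on the slide; the function name and the base names nid and body are hypothetical.

```python
# First letter of the prefix for each Solr field type shown on the slide.
TYPE_LETTER = {"long": "i", "boolean": "b", "string": "s", "text": "t"}


def dynamic_field_name(solr_type: str, multi_valued: bool, base_name: str) -> str:
    """Build a dynamic field name such as 'is_nid' or 'tm_body'."""
    try:
        letter = TYPE_LETTER[solr_type]
    except KeyError:
        raise ValueError(f"unsupported field type: {solr_type!r}") from None
    cardinality = "m" if multi_valued else "s"
    return f"{letter}{cardinality}_{base_name}"


if __name__ == "__main__":
    print(dynamic_field_name("long", False, "nid"))
    print(dynamic_field_name("text", True, "body"))
```

A name built this way matches one of the dynamicField patterns, so no new field declaration is needed in schema.xml.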

SLIDE 20

Drupal 7

Important modules to consider

  • Apache Solr Autocomplete
  • Apache Solr Views
  • Profile2 Apache Solr (for user information searching)
  • Apache Solr Attachments (just a cool module)

SLIDE 21

Api Docs

http://api.drupalhelp.net/api/apachesolr/classes/7
http://api.drupalhelp.net/api/apachesolr/Solr_Base_Query.php/class/SolrBaseQuery/7

SLIDE 22

Search API

SLIDE 23

Search API

  • wget http://solr-php-client.googlecode.com/files/SolrPhpClient.r60.2011-05-04.tgz
  • Get the modules. If you use drush you will have: search_api, search_api_facetapi, search_api_views
  • You will also need: search_api_solr
  • Enable the modules (yeah, OK).
  • Copy Search API Solr's schema.xml and solrconfig.xml to your core's conf folder.
  • In the admin search settings, create a search and an index (let me show you).
  • Unlike Apache Solr Integration, you do not really have to play with schema.xml, but go take a look if you are curious.

SLIDE 24

GeoSpatial

http://ericlondon.com/posts/240-geospatial-apache-solr-searching-in-drupal-7-using-the-search-api-module-ubuntu-version

SLIDE 25

Questions?

SLIDE 26

Thank You!
