QueryElevationComponent

Let's read through the source of QueryElevationComponent.

http://www.jarvana.com/jarvana/view/org/apache/solr/solr-core/1.4.1/solr-core-1.4.1-sources.jar!/org/apache/solr/handler/component/QueryElevationComponent.java?format=ok

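QueryElevationComponent is implemented as follows.

The prepare method swaps in a modified query and sort rule
The process method is empty and does nothing
The other methods are not implemented either, presumably because QueryElevationComponent does not support distributed search.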

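So, to understand QueryElevationComponent, it is enough to understand the prepare method.

To summarize up front, the prepare method does roughly the following.

Read the parameters
Obtain the booster instance defined by the configuration (typically elevate.xml) for the query
Manipulate the query
Manipulate the sort order

Let's read the prepare method in detail.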

  @Override
  public void prepare(ResponseBuilder rb) throws IOException
  {
    SolrQueryRequest req = rb.req;
    SolrParams params = req.getParams();
    // A runtime param can skip 
    if( !params.getBool( ENABLE, true ) ) {
      return;
    }

    // A runtime parameter can alter the config value for forceElevation
    boolean force = params.getBool( FORCE_ELEVATION, forceElevation );

Up to here is the parameter reading.
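The defaulting behavior of the two getBool calls can be sketched in plain Java. This is only an illustration: the Map stands in for SolrParams, the getBool helper is a simplified stand-in written for this post, and the request parameter names "enableElevation" and "forceElevation" are my assumption about what the ENABLE and FORCE_ELEVATION constants resolve to.

```java
import java.util.HashMap;
import java.util.Map;

public class ElevationParamsSketch {
    // Assumed request parameter names; the excerpt above only shows the constants.
    static final String ENABLE = "enableElevation";
    static final String FORCE_ELEVATION = "forceElevation";

    // Simplified stand-in for a getBool(name, default) lookup:
    // return the default when the parameter is absent, otherwise parse it.
    static boolean getBool(Map<String, String> params, String name, boolean def) {
        String value = params.get(name);
        return value == null ? def : Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put(FORCE_ELEVATION, "true");

        boolean forceElevationFromConfig = false; // hypothetical config value

        // Elevation stays enabled unless the request switches it off.
        System.out.println(getBool(params, ENABLE, true));
        // The request value overrides the configured forceElevation.
        System.out.println(getBool(params, FORCE_ELEVATION, forceElevationFromConfig));
    }
}
```

The point is that both values have a fallback: elevation is on by default, and the configured forceElevation is used unless the request overrides it.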

    Query query = rb.getQuery();
    String qstr = rb.getQueryString();
    if( query == null || qstr == null) {
      return;
    }

    qstr = getAnalyzedQuery(qstr);
    IndexReader reader = req.getSearcher().getReader();
    ElevationObj booster = null;
    try {
      booster = getElevationMap( reader, req.getCore() ).get( qstr );
    }
    catch( Exception ex ) {
      throw new SolrException( SolrException.ErrorCode.SERVER_ERROR,
          "Error loading elevation", ex );      
    }

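This part analyzes the query string and fetches the booster that corresponds to the keyword.
The booster looks like this.

Its type is the ElevationObj class
- ElevationObj is defined separately as an inner class.
- ElevationObj is a class that has nothing but a constructor

The ElevationObj constructor is defined as shown below, and its arguments mean the following.

qstr is the keyword
elevate is the list of IDs of the documents that are not marked exclude=true
exclude is the list of IDs of the documents that are marked exclude=true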

    ElevationObj( String qstr, List<String> elevate, List<String> exclude ) throws IOException
    {
      this.text = qstr;
      this.analyzed = getAnalyzedQuery( this.text );
      
      this.include = new BooleanQuery();
      this.include.setBoost( 0 );
      this.priority = new HashMap<String, Integer>();
      int max = elevate.size()+5;
      for( String id : elevate ) {
        TermQuery tq = new TermQuery( new Term( idField, id ) );
        include.add( tq, BooleanClause.Occur.SHOULD );
        this.priority.put( id, max-- );
      }

The include field holds a BooleanQuery that ORs together (SHOULD clauses) one TermQuery per elevated ID.

      if( exclude == null || exclude.isEmpty() ) {
        this.exclude = null;
      }
      else {
        this.exclude = new BooleanClause[exclude.size()];
        for( int i=0; i<exclude.size(); i++ ) {
          TermQuery tq = new TermQuery( new Term( idField, exclude.get(i) ) );
          this.exclude[i] = new BooleanClause( tq, BooleanClause.Occur.MUST_NOT );
        }
      }

The exclude field holds an array of BooleanClause objects, each a MUST_NOT condition on one excluded ID (or null when nothing is excluded).

      this.comparatorSource = new ElevationComparatorSource(priority);
    }

Finally, the comparatorSource field is set to a custom comparator source used for sorting.
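To get a feel for what the priority map and the custom comparator achieve together, here is a minimal plain-Java sketch without any Lucene classes. The Hit class and the sortIds helper are hypothetical names made up for this illustration, and I am assuming that documents missing from the priority map are treated as priority 0; the body of ElevationComparatorSource is not shown in this post, so take that part as a guess.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ElevationSortSketch {
    // Hypothetical stand-in for a search hit: an ID and a relevance score.
    static final class Hit {
        final String id;
        final double score;
        Hit(String id, double score) { this.id = id; this.score = score; }
    }

    // Same numbering as the constructor: the first elevated ID gets the highest value.
    static Map<String, Integer> buildPriority(List<String> elevate) {
        Map<String, Integer> priority = new HashMap<>();
        int max = elevate.size() + 5;
        for (String id : elevate) {
            priority.put(id, max--);
        }
        return priority;
    }

    // Sort by priority (descending) first, then by score (descending).
    // Assumption: IDs that are not in the map count as priority 0.
    static List<String> sortIds(List<Hit> hits, Map<String, Integer> priority) {
        Comparator<Hit> byPriority =
            Comparator.comparingInt((Hit h) -> priority.getOrDefault(h.id, 0)).reversed();
        Comparator<Hit> byScore =
            Comparator.comparingDouble((Hit h) -> h.score).reversed();

        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(byPriority.thenComparing(byScore));

        List<String> ids = new ArrayList<>();
        for (Hit h : sorted) {
            ids.add(h.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, Integer> priority = buildPriority(Arrays.asList("d3", "d2"));
        List<Hit> hits = Arrays.asList(
            new Hit("d1", 0.9), new Hit("d2", 0.5), new Hit("d3", 0.1));
        System.out.println(sortIds(hits, priority));
    }
}
```

With elevate = [d3, d2] the map becomes d3=7, d2=6, so d3 and d2 sort ahead of d1 even though d1 has the best score.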

Back to the prepare method.

    if( booster != null ) {
      // Change the query to insert forced documents
      BooleanQuery newq = new BooleanQuery( true );
      newq.add( query, BooleanClause.Occur.SHOULD );
      newq.add( booster.include, BooleanClause.Occur.SHOULD );
      if( booster.exclude != null ) {
        for( BooleanClause bq : booster.exclude ) {
          newq.add( bq );
        }
      }
      rb.setQuery( newq );

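The search condition is rebuilt as follows.

booster.include is added to the query so that the documents to be elevated are included in the search results
booster.exclude is also added to the query so that the documents to be excluded do not appear in the search results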
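The matching logic of the rebuilt query can be modeled with plain sets. This is a sketch of the boolean semantics only (at least one SHOULD clause must match, and no MUST_NOT clause may match), not Lucene code; the rewrite method and its arguments are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ElevationQuerySketch {
    // Returns the IDs that the rebuilt query would match.
    //   originalMatches: IDs matched by the user's original query (SHOULD)
    //   include:         IDs to elevate (SHOULD)
    //   exclude:         IDs to exclude (MUST_NOT)
    static List<String> rewrite(List<String> allIds, Set<String> originalMatches,
                                Set<String> include, Set<String> exclude) {
        List<String> result = new ArrayList<>();
        for (String id : allIds) {
            boolean anyShould = originalMatches.contains(id) || include.contains(id);
            boolean anyMustNot = exclude.contains(id);
            if (anyShould && !anyMustNot) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> allIds = Arrays.asList("d1", "d2", "d3", "d4");
        Set<String> originalMatches = new HashSet<>(Arrays.asList("d1", "d2"));
        Set<String> include = new HashSet<>(Arrays.asList("d3"));
        Set<String> exclude = new HashSet<>(Arrays.asList("d2"));
        // d3 is pulled in although the original query missed it; d2 is dropped.
        System.out.println(rewrite(allIds, originalMatches, include, exclude));
    }
}
```

Note that an elevated document ends up in the result set even when the original query would not have matched it at all.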

      // if the sort is 'score desc' use a custom sorting method to 
      // insert documents in their proper place 
      SortSpec sortSpec = rb.getSortSpec();
      if( sortSpec.getSort() == null ) {
        sortSpec.setSort( new Sort( new SortField[] {
            new SortField(idField, booster.comparatorSource, false ),
            new SortField(null, SortField.SCORE, false)
        }));
      }
      else {
        // Check if the sort is based on score
        boolean modify = false;
        SortField[] current = sortSpec.getSort().getSort();
        ArrayList<SortField> sorts = new ArrayList<SortField>( current.length + 1 );
        // Perhaps force it to always sort by score
        if( force && current[0].getType() != SortField.SCORE ) {
          sorts.add( new SortField(idField, booster.comparatorSource, false ) );
          modify = true;
        }
        for( SortField sf : current ) {
          if( sf.getType() == SortField.SCORE ) {
            sorts.add( new SortField(idField, booster.comparatorSource, sf.getReverse() ) );
            modify = true;
          }
          sorts.add( sf );
        }
        if( modify ) {
          sortSpec.setSort( new Sort( sorts.toArray( new SortField[sorts.size()] ) ) );
        }
      }

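In addition, booster.comparatorSource is added to the sort specification.
As a result, the elevated documents come first whenever the primary sort is by score (the default), or for any sort when forceElevation is enabled.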
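How the sort list gets rewritten is easier to follow with sort fields reduced to plain strings. In the sketch below, "score" stands for a score sort, "elevation" stands for the SortField built from booster.comparatorSource, and an empty list stands for a null Sort; all of these are simplifications for illustration, and sort direction is ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ElevationSortSpecSketch {
    // Mirrors the branching in prepare(), with strings instead of SortField objects.
    static List<String> rewriteSort(List<String> current, boolean force) {
        if (current.isEmpty()) {
            // No explicit sort: elevation first, then score.
            return Arrays.asList("elevation", "score");
        }
        List<String> sorts = new ArrayList<>();
        boolean modify = false;
        // With force, elevation goes first even when the primary sort is not score.
        if (force && !current.get(0).equals("score")) {
            sorts.add("elevation");
            modify = true;
        }
        // Every score sort gets the elevation sort inserted right before it.
        for (String sf : current) {
            if (sf.equals("score")) {
                sorts.add("elevation");
                modify = true;
            }
            sorts.add(sf);
        }
        return modify ? sorts : current;
    }

    public static void main(String[] args) {
        System.out.println(rewriteSort(Collections.<String>emptyList(), false));
        System.out.println(rewriteSort(Arrays.asList("price"), false));
        System.out.println(rewriteSort(Arrays.asList("price"), true));
        System.out.println(rewriteSort(Arrays.asList("price", "score"), false));
    }
}
```

A sort on price alone is left untouched unless force is set, and a sort on price followed by score only gets the elevation applied after price.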

I'll skip the rest; the method ends by adding extra information when debug output is enabled.

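At first I assumed QueryElevationComponent implemented include/exclude by forcibly stuffing documents into the top of the search results and forcibly deleting others from them. Reading the source showed otherwise: it manipulates the query and the sort order before the search runs (in the prepare method), and that is what changes the ordering.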

Seeing that search results can be manipulated this simply, I was impressed once again by how flexible Solr is.