Package 

Class ScoringFilters

  • All Implemented Interfaces:
    ai.platon.pulsar.common.config.Parameterized, ai.platon.pulsar.crawl.scoring.ScoringFilter

    
    public final class ScoringFilters implements ScoringFilter

    • Field Summary

      Modifier and Type Field Description
      private final ImmutableConfig conf
    • Method Summary

      Modifier and Type Method Description
      final ImmutableConfig getConf()
      Params getParams()
      Unit initialScore(WebPage page) Calculate a new initial score, used when adding newly discovered pages.
      Unit injectedScore(WebPage page) Calculate a new initial score, used when injecting new pages.
      ScoreVector generatorSortValue(WebPage page, ScoreVector initSort) Calculate a sort value for Generate.
      Unit distributeScoreToOutlinks(WebPage page, WebGraph graph, Collection<WebEdge> outgoingEdges, Integer allCount) Distribute the current page's score to all of its outlinked pages.
      Unit updateScore(WebPage page, WebGraph graph, Collection<WebEdge> incomingEdges) Calculate a new score during table update, based on the values contributed by inlinked pages.
      Unit updateContentScore(WebPage page)
      Float indexerScore(String url, IndexDocument doc, WebPage page, Float initScore) Calculate a Lucene document boost.
      String toString()
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • ScoringFilters

        ScoringFilters(ImmutableConfig conf)
      • ScoringFilters

        ScoringFilters(List<ScoringFilter> scoringFilters, ImmutableConfig conf)
    • Method Detail

      • getConf

         final ImmutableConfig getConf()
      • initialScore

         Unit initialScore(WebPage page)

        Calculate a new initial score, used when adding newly discovered pages.

        Parameters:
        page - page row.
      • injectedScore

         Unit injectedScore(WebPage page)

        Calculate a new initial score, used when injecting new pages.

        Parameters:
        page - new page.
      • generatorSortValue

         ScoreVector generatorSortValue(WebPage page, ScoreVector initSort)

        Calculate a sort value for Generate.

        Parameters:
        page - page row.
        initSort - initial sort value, or the value returned by the previous filter in the chain.
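
        The chaining described for initSort can be sketched as follows. This is a minimal illustration, not the Pulsar implementation: SortValueChainSketch, SortFilter, and chain are hypothetical names, a plain double stands in for ScoreVector, and a URL string stands in for WebPage.

```java
import java.util.List;

/** Hypothetical sketch of chained sort-value filters; not the Pulsar API. */
final class SortValueChainSketch {

    /** Simplified stand-in for a filter: a double replaces ScoreVector, a URL replaces WebPage. */
    interface SortFilter {
        double sortValue(String url, double initSort);
    }

    /** Passes the value through each filter in order; each filter sees the previous filter's output. */
    static double chain(List<SortFilter> filters, String url, double initSort) {
        double value = initSort;
        for (SortFilter filter : filters) {
            value = filter.sortValue(url, value);
        }
        return value;
    }

    public static void main(String[] args) {
        SortFilter doubling = (url, v) -> v * 2.0; // hypothetical filter
        SortFilter plusOne = (url, v) -> v + 1.0;  // hypothetical filter
        System.out.println(chain(List.of(doubling, plusOne), "https://example.com/", 1.0));
        System.out.println(chain(List.of(plusOne, doubling), "https://example.com/", 1.0));
    }
}
```

        In this sketch the two calls give different results (3.0 and 4.0), which shows that the order of filters in a chain matters whenever a filter depends on the incoming value.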
      • distributeScoreToOutlinks

         Unit distributeScoreToOutlinks(WebPage page, WebGraph graph, Collection<WebEdge> outgoingEdges, Integer allCount)

        Distribute the current page's score to all of its outlinked pages.

        Parameters:
        page - page row
        allCount - number of all collected outlinks from the source page
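
        One possible distribution policy is an even split of the page score over allCount, applied to each outgoing edge. The sketch below assumes that policy purely for illustration; the source does not state which policy the configured filters use. OutlinkScoreSketch and distribute are hypothetical names, and URL strings stand in for WebEdge objects.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical even-split distribution; the real policy depends on the configured filters. */
final class OutlinkScoreSketch {

    /**
     * Gives each outgoing URL a share of pageScore / allCount.
     * allCount may exceed outgoingUrls.size() when some collected outlinks were dropped.
     */
    static Map<String, Double> distribute(double pageScore, List<String> outgoingUrls, int allCount) {
        Map<String, Double> contributions = new LinkedHashMap<>();
        if (allCount <= 0) {
            return contributions; // nothing to distribute
        }
        double share = pageScore / allCount;
        for (String url : outgoingUrls) {
            contributions.put(url, share);
        }
        return contributions;
    }

    public static void main(String[] args) {
        // Four outlinks were collected, but only two edges are kept.
        System.out.println(distribute(1.0, List.of("https://example.com/a", "https://example.com/b"), 4));
    }
}
```

        Dividing by allCount rather than by the number of kept edges means that dropped outlinks still consume their share, so the kept edges do not receive a larger contribution.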
      • updateScore

         Unit updateScore(WebPage page, WebGraph graph, Collection<WebEdge> incomingEdges)

        Calculate a new score during table update, based on the values contributed by inlinked pages.

        Parameters:
        page - page row
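
        As a rough illustration of combining inlink contributions, the sketch below adds the contributions to the page's current score. The additive rule is an assumption made for the example; the source only says that the new score is based on the inlinked pages' values. InlinkScoreSketch and update are hypothetical names, and plain doubles stand in for the scores carried by WebEdge objects.

```java
import java.util.Collection;
import java.util.List;

/** Hypothetical additive update; the real combination rule depends on the configured filters. */
final class InlinkScoreSketch {

    /** Returns the current score plus the sum of all inlink contributions. */
    static double update(double currentScore, Collection<Double> inlinkContributions) {
        double total = currentScore;
        for (double contribution : inlinkContributions) {
            total += contribution;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(update(1.0, List.of(0.25, 0.5)));
    }
}
```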
      • indexerScore

         Float indexerScore(String url, IndexDocument doc, WebPage page, Float initScore)

        Calculate a Lucene document boost.

        Parameters:
        url - URL of the page
        doc - the index document
        page - page row
        initScore - initial boost value for the Lucene document.
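
        A boost can, for example, scale initScore by a power of the page score. The formula below is an assumption chosen for illustration (a power-law boost of the kind used by some link-based scorers), not a statement of what ScoringFilters computes. IndexerBoostSketch, boost, and scorePower are hypothetical names.

```java
/** Hypothetical power-law boost; not confirmed as the formula used by any Pulsar filter. */
final class IndexerBoostSketch {

    /** Computes pageScore^scorePower * initScore. A scorePower of 0 leaves initScore unchanged. */
    static float boost(float pageScore, float scorePower, float initScore) {
        return (float) Math.pow(pageScore, scorePower) * initScore;
    }

    public static void main(String[] args) {
        System.out.println(boost(4.0f, 0.5f, 2.0f)); // square root of the page score times the initial boost
        System.out.println(boost(9.0f, 0.0f, 2.5f)); // page score ignored
    }
}
```

        A scorePower below 1 dampens the influence of very high page scores on the final boost, which is one reason such an exponent is often made configurable.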