Package ai.platon.pulsar.crawl.filter
Class Summary

Class                 Description
BlockFilter
CrawlFilter           TODO: configurable CrawlFilters. TODO: need full unit test. TODO: move to plugin, urlfilter/contentfilter, etc.
CrawlUrlFilters       Creates and caches CrawlUrlFilter implementing plugins.
CrawlUrlNormalizers   This class uses a "chained filter" pattern to run defined normalizers.
TextFilter
UrlFilterException
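The "chained filter" pattern mentioned for CrawlUrlNormalizers can be illustrated with a minimal sketch. Everything named here (NormalizerChainSketch, UrlNormalizer, TRIM, STRIP_FRAGMENT, normalizeAll) is hypothetical and chosen for illustration; the actual CrawlUrlNormalizer signature is not shown in this summary, so a single String-to-String method is assumed.

```java
import java.util.List;

class NormalizerChainSketch {

    /** Hypothetical stand-in for a normalizer; NOT the real CrawlUrlNormalizer signature. */
    interface UrlNormalizer {
        String normalize(String url);
    }

    /** Removes leading and trailing whitespace. */
    static final UrlNormalizer TRIM = String::trim;

    /** Drops the fragment part ("#...") of a URL, if present. */
    static final UrlNormalizer STRIP_FRAGMENT = url -> {
        int hash = url.indexOf('#');
        return hash >= 0 ? url.substring(0, hash) : url;
    };

    /**
     * Runs the normalizers in order; the output of one is the input of the next.
     * A null result (assumed here to mean "discard this URL") stops the chain early.
     */
    static String normalizeAll(List<UrlNormalizer> chain, String url) {
        String current = url;
        for (UrlNormalizer normalizer : chain) {
            if (current == null) {
                break;
            }
            current = normalizer.normalize(current);
        }
        return current;
    }

    /** Convenience entry point using an assumed default chain. */
    static String normalize(String url) {
        return normalizeAll(List.of(TRIM, STRIP_FRAGMENT), url);
    }

    public static void main(String[] args) {
        System.out.println(normalize("  http://example.com/a#top "));
    }
}
```

The order of the chain matters: each normalizer sees the result of the previous one, so a step that trims whitespace should run before steps that inspect the URL's structure.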
Interface Summary

Interface             Description
CrawlUrlFilter        Interface used to limit which URLs enter AppConstants.
CrawlUrlNormalizer    Interface used to convert URLs to normal form and optionally perform substitutions.
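As a rough illustration of how a URL filter can limit which URLs are admitted, the sketch below assumes a convention in which a filter returns the URL unchanged to accept it and null to reject it. That convention, and every name used (UrlFilterSketch, UrlFilter, HTTP_ONLY, NO_IMAGES, filterAll), is an assumption made for this example; the actual CrawlUrlFilter method is not shown in this summary.

```java
import java.util.List;
import java.util.Locale;

class UrlFilterSketch {

    /** Hypothetical stand-in for a URL filter; NOT the real CrawlUrlFilter signature. */
    interface UrlFilter {
        /** Returns the URL if accepted, or null if rejected (assumed convention). */
        String filter(String url);
    }

    /** Accepts only http and https URLs. */
    static final UrlFilter HTTP_ONLY = url ->
            (url.startsWith("http://") || url.startsWith("https://")) ? url : null;

    /** Rejects URLs that end with a common image extension. */
    static final UrlFilter NO_IMAGES = url -> {
        String lower = url.toLowerCase(Locale.ROOT);
        boolean image = lower.endsWith(".png") || lower.endsWith(".jpg") || lower.endsWith(".gif");
        return image ? null : url;
    };

    /** A URL passes only if every filter accepts it; the first rejection wins. */
    static String filterAll(List<UrlFilter> filters, String url) {
        String current = url;
        for (UrlFilter f : filters) {
            if (current == null) {
                return null;
            }
            current = f.filter(current);
        }
        return current;
    }

    /** Convenience entry point using an assumed default set of filters. */
    static String filter(String url) {
        return filterAll(List.of(HTTP_ONLY, NO_IMAGES), url);
    }

    public static void main(String[] args) {
        System.out.println(filter("https://example.com/index.html"));
        System.out.println(filter("https://example.com/logo.png"));
    }
}
```

Returning null for rejection keeps each filter a plain function of one string, which makes filters easy to compose, at the cost of requiring callers to check for null.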