All Implemented Interfaces:
ai.platon.pulsar.common.collect.CrawlableFatLinkCollector, ai.platon.pulsar.common.collect.collector.DataCollector, ai.platon.pulsar.common.collect.collector.PriorityDataCollector, kotlin.Comparable
public class CircularHyperlinkCollector extends HyperlinkCollector
-
-
Field Summary
Fields
Modifier and Type                                              Field
private String                                                 name
private final Integer                                          size
private final Integer                                          estimatedSize
private UrlNormalizerPipeline                                  urlNormalizer
private final ConcurrentSkipListMap<String, CrawlableFatLink>  fatLinks
private final PulsarSession                                    session
private final Queue<NormUrl>                                   seeds
private final Integer                                          capacity
private Integer                                                collectCount
private final Duration                                         collectTime
private Integer                                                collectedCount
private String                                                 country
private final Instant                                          createTime
private Instant                                                deadTime
private String                                                 district
private final Integer                                          estimatedExternalSize
private final Integer                                          externalSize
private Instant                                                firstCollectTime
private final Integer                                          id
private final Boolean                                          isDead
private final Set<String>                                      labels
private String                                                 lang
private Instant                                                lastCollectedTime
private final Integer                                          priority
-
Constructor Summary
Constructors
CircularHyperlinkCollector(PulsarSession session, NormUrl seed, Priority13 priority)
CircularHyperlinkCollector(PulsarSession session, Queue<NormUrl> seeds, Priority13 priority)
-
Method Summary
-
Methods inherited from class ai.platon.pulsar.common.collect.collector.AbstractPriorityDataCollector
collectTo, collectTo, collectTo, compareTo
Methods inherited from class ai.platon.pulsar.common.collect.CircularHyperlinkCollector
dump, hasMore, remove, remove
Methods inherited from class ai.platon.pulsar.common.collect.collector.AbstractDataCollector
deepClear
Methods inherited from class ai.platon.pulsar.common.collect.HyperlinkCollector
removeAll, removeAll, toString
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
-
Constructor Detail
-
CircularHyperlinkCollector
CircularHyperlinkCollector(PulsarSession session, NormUrl seed, Priority13 priority)
-
CircularHyperlinkCollector
CircularHyperlinkCollector(PulsarSession session, Queue<NormUrl> seeds, Priority13 priority)
-
-
Method Detail
-
getEstimatedSize
Integer getEstimatedSize()
-
getUrlNormalizer
final UrlNormalizerPipeline getUrlNormalizer()
-
setUrlNormalizer
final Unit setUrlNormalizer(UrlNormalizerPipeline urlNormalizer)
-
getFatLinks
ConcurrentSkipListMap<String, CrawlableFatLink> getFatLinks()
Tracks the status of this batch; a notice is needed when the batch is finished.
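
The page does not show how a finished batch is detected. As an illustration only, the following Java sketch tracks batches in a `ConcurrentSkipListMap` keyed by portal URL and reports the moment the last link of a batch completes. `BatchTracker`, `register`, `finishOne` and `activeBatches` are hypothetical names, and the per-batch counter is an assumption; none of this is taken from `CrawlableFatLink` or from Pulsar itself.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Pulsar: counts unfinished links per batch.
final class BatchTracker {
    // Key: portal URL of the batch. Value: number of links still unfinished.
    private final ConcurrentSkipListMap<String, AtomicInteger> pending =
            new ConcurrentSkipListMap<>();

    // Starts tracking a batch with the given number of links.
    void register(String portalUrl, int linkCount) {
        if (linkCount <= 0) {
            throw new IllegalArgumentException("linkCount must be positive");
        }
        pending.put(portalUrl, new AtomicInteger(linkCount));
    }

    // Marks one link of the batch as done.
    // Returns true only for the call that completes the batch.
    boolean finishOne(String portalUrl) {
        AtomicInteger remaining = pending.get(portalUrl);
        if (remaining == null) {
            return false; // unknown batch, or already finished
        }
        if (remaining.decrementAndGet() == 0) {
            // Remove the entry so that the finished batch is no longer tracked.
            return pending.remove(portalUrl, remaining);
        }
        return false;
    }

    // Number of batches that still have unfinished links.
    int activeBatches() {
        return pending.size();
    }
}
```

A caller would raise its own notice (a log line, a callback) when `finishOne` returns `true`. The sorted map keeps the batches ordered by key, which can be convenient for dumping their status.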
-
getSession
final PulsarSession getSession()
The Pulsar session to use.
-
getSeeds
final Queue<NormUrl> getSeeds()
The URLs of the portal pages from which hyperlinks are extracted.
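
The class name suggests that the seeds are visited in rotation, although this page does not describe the traversal. A minimal Java sketch of that idea, using plain strings instead of `NormUrl`: each seed is taken from the head of the queue and put back at the tail. `CircularSeedQueue` is a hypothetical name, and its `next`, `remove` and `hasMore` methods are assumptions for illustration, not the signatures of the library's methods.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Queue;

// Hypothetical helper, not part of Pulsar: cycles through seed URLs.
final class CircularSeedQueue {
    private final Queue<String> seeds;

    CircularSeedQueue(Collection<String> initial) {
        this.seeds = new ArrayDeque<>(initial);
    }

    // Returns the next seed and moves it to the tail,
    // or returns null if no seeds remain.
    String next() {
        String seed = seeds.poll();
        if (seed != null) {
            seeds.add(seed);
        }
        return seed;
    }

    // Removes a seed so that it is no longer visited.
    boolean remove(String seed) {
        return seeds.remove(seed);
    }

    // True while at least one seed is left in the rotation.
    boolean hasMore() {
        return !seeds.isEmpty();
    }
}
```

With seeds `a` and `b`, successive calls to `next()` yield `a`, `b`, `a`, and so on until a seed is removed.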
-
getCapacity
Integer getCapacity()
-
getCollectCount
Integer getCollectCount()
-
setCollectCount
Unit setCollectCount(Integer collectCount)
-
getCollectTime
Duration getCollectTime()
-
getCollectedCount
Integer getCollectedCount()
-
setCollectedCount
Unit setCollectedCount(Integer collectedCount)
-
getCountry
String getCountry()
-
setCountry
Unit setCountry(String country)
-
getCreateTime
Instant getCreateTime()
-
getDeadTime
Instant getDeadTime()
-
setDeadTime
Unit setDeadTime(Instant deadTime)
-
getDistrict
String getDistrict()
-
setDistrict
Unit setDistrict(String district)
-
getEstimatedExternalSize
Integer getEstimatedExternalSize()
-
getExternalSize
Integer getExternalSize()
-
getFirstCollectTime
Instant getFirstCollectTime()
-
setFirstCollectTime
Unit setFirstCollectTime(Instant firstCollectTime)
-
getLastCollectedTime
Instant getLastCollectedTime()
-
setLastCollectedTime
Unit setLastCollectedTime(Instant lastCollectedTime)
-
getPriority
Integer getPriority()
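
Because the class implements `Comparable` and exposes an integer priority, collectors can presumably be ordered in a priority queue. The sketch below shows that pattern in plain Java. `PrioritizedCollector` and `PriorityDemo` are hypothetical stand-ins, and the rule that a smaller number means a higher priority is an assumption; this page does not state how `compareTo` orders collectors.

```java
import java.util.PriorityQueue;

// Hypothetical stand-in for a prioritized collector; not a Pulsar class.
final class PrioritizedCollector implements Comparable<PrioritizedCollector> {
    final String name;
    final int priority;

    PrioritizedCollector(String name, int priority) {
        this.name = name;
        this.priority = priority;
    }

    @Override
    public int compareTo(PrioritizedCollector other) {
        // Assumption: a smaller number means a higher priority.
        return Integer.compare(priority, other.priority);
    }
}

final class PriorityDemo {
    // Returns the name of the collector that would be served first.
    static String firstToRun() {
        PriorityQueue<PrioritizedCollector> queue = new PriorityQueue<>();
        queue.add(new PrioritizedCollector("background", 10));
        queue.add(new PrioritizedCollector("urgent", -10));
        queue.add(new PrioritizedCollector("normal", 0));
        return queue.poll().name;
    }
}
```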
-