All Implemented Interfaces:
ai.platon.pulsar.common.config.Parameterized
public class LoadOptions extends CommonOptions
Created by vincent on 19-4-24. Copyright @ 2013-2017 Platon AI. All rights reserved.
NOTICE: every option named optionName has to take a Parameter name -optionName.
NOTICE: every load task should have its own load options; sharing one LoadOptions instance between tasks is bad practice.
TODO: consider making LoadOptions visible to all modules.
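The naming rule in the first notice can be checked mechanically. The following is a minimal sketch under stated assumptions: the `Param` annotation, `DemoOptions`, `BadOptions`, and `OptionNameCheck` are all hypothetical stand-ins written for illustration and are not part of the library, whose real parameter annotation and fields are not shown on this page.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a command-line parameter annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Param {
    String[] names();
}

// Hypothetical options class that follows the rule: field x has a name "-x".
final class DemoOptions {
    @Param(names = {"-taskId", "--task-id"})
    String taskId = "";

    @Param(names = {"-expires", "--expires"})
    String expires = "";
}

// Hypothetical options class that breaks the rule: "-iframe" is missing.
final class BadOptions {
    @Param(names = {"-i", "--frame"})
    int iframe = 0;
}

final class OptionNameCheck {
    // Returns the names of annotated fields that lack a "-fieldName" parameter name.
    static List<String> violations(Class<?> type) {
        List<String> bad = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            Param param = field.getAnnotation(Param.class);
            if (param == null) {
                continue; // not an option field
            }
            String expected = "-" + field.getName();
            if (!Arrays.asList(param.names()).contains(expected)) {
                bad.add(field.getName());
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        System.out.println("DemoOptions violations: " + violations(DemoOptions.class));
        System.out.println("BadOptions violations: " + violations(BadOptions.class));
    }
}
```

A check of this kind could run in a unit test so that a newly added option cannot silently break the -optionName convention.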
-
-
Nested Class Summary
Nested Classes
Modifier and Type Class Description
public class LoadOptions.Companion
-
Field Summary
-
Constructor Summary
Constructors
Constructor Description
LoadOptions(Array<String> argv, VolatileConfig conf, PulsarEventHandler eventHandler)
-
Method Summary
Modifier and Type Method Description
final String getLabel()
final Unit setLabel(String label)
final String getTaskId()
final Unit setTaskId(String taskId)
final Instant getTaskTime()
    The task time accepts the following date-time formats: ISO_INSTANT (yyyy-MM-ddTHH:mm:ssZ) or yyyy-MM-dd[ HH[:mm:ss]].
final Unit setTaskTime(Instant taskTime)
    The task time accepts the following date-time formats: ISO_INSTANT (yyyy-MM-ddTHH:mm:ssZ) or yyyy-MM-dd[ HH[:mm:ss]].
final Instant getDeadTime()
final Unit setDeadTime(Instant deadTime)
final String getAuthToken()
final Unit setAuthToken(String authToken)
final Boolean getReadonly()
final Unit setReadonly(Boolean readonly)
final Boolean getIsResource()
final Unit setIsResource(Boolean isResource)
final Duration getExpires()
    Web page expiry time. The term "expires" is usually used for an expiry time, for example in http-equiv or in the cookie specification, where it presumably means "expires at". The expires field supports both the ISO-8601 standard format (PnDTnHnMn.nS) and the Hadoop time duration format (valid units: ns, us, ms, s, m, h, d).
final Unit setExpires(Duration expires)
    Web page expiry time. The term "expires" is usually used for an expiry time, for example in http-equiv or in the cookie specification, where it presumably means "expires at". The expires field supports both the ISO-8601 standard format (PnDTnHnMn.nS) and the Hadoop time duration format (valid units: ns, us, ms, s, m, h, d).
final Instant getExpireAt()
    The page is expired if the current time is later than expireAt.
final Unit setExpireAt(Instant expireAt)
    The page is expired if the current time is later than expireAt.
final Duration getFetchInterval()
    The page is expired if the current time is later than expireAt.
final Unit setFetchInterval(Duration fetchInterval)
    The page is expired if the current time is later than expireAt.
final String getOutLinkSelector()
    Arrange links
final Unit setOutLinkSelector(String outLinkSelector)
    Arrange links
final String getOutLinkPattern()
final Unit setOutLinkPattern(String outLinkPattern)
final String getClickTarget()
final Unit setClickTarget(String clickTarget)
final String getNextPageSelector()
final Unit setNextPageSelector(String nextPageSelector)
final Integer getIframe()
final Unit setIframe(Integer iframe)
final Integer getTopLinks()
final Unit setTopLinks(Integer topLinks)
final Integer getTopNAnchorGroups()
final Unit setTopNAnchorGroups(Integer topNAnchorGroups)
final String getWaitNonBlank()
final Unit setWaitNonBlank(String waitNonBlank)
final String getRequireNotBlank()
final Unit setRequireNotBlank(String requireNotBlank)
final Integer getRequireSize()
final Unit setRequireSize(Integer requireSize)
final Integer getRequireImages()
final Unit setRequireImages(Integer requireImages)
final Integer getRequireAnchors()
final Unit setRequireAnchors(Integer requireAnchors)
final FetchMode getFetchMode()
final Unit setFetchMode(FetchMode fetchMode)
final BrowserType getBrowser()
    TODO: session-scoped browser choice is not supported yet.
final Unit setBrowser(BrowserType browser)
    TODO: session-scoped browser choice is not supported yet.
final Integer getScrollCount()
final Unit setScrollCount(Integer scrollCount)
final Duration getScrollInterval()
final Unit setScrollInterval(Duration scrollInterval)
final Duration getScriptTimeout()
final Unit setScriptTimeout(Duration scriptTimeout)
final Duration getPageLoadTimeout()
final Unit setPageLoadTimeout(Duration pageLoadTimeout)
final BrowserType getItemBrowser()
    TODO: session-scoped browser choice is not supported yet.
final Unit setItemBrowser(BrowserType itemBrowser)
    TODO: session-scoped browser choice is not supported yet.
final Duration getItemExpires()
final Unit setItemExpires(Duration itemExpires)
final Instant getItemExpireAt()
    Web page expiry time
final Unit setItemExpireAt(Instant itemExpireAt)
    Web page expiry time
final Integer getItemScrollCount()
    Note: if the page is scrolled too many times, it may fail to calculate the vision information.
final Unit setItemScrollCount(Integer itemScrollCount)
    Note: if the page is scrolled too many times, it may fail to calculate the vision information.
final Duration getItemScrollInterval()
final Unit setItemScrollInterval(Duration itemScrollInterval)
final Duration getItemScriptTimeout()
final Unit setItemScriptTimeout(Duration itemScriptTimeout)
final Duration getItemPageLoadTimeout()
final Unit setItemPageLoadTimeout(Duration itemPageLoadTimeout)
final String getItemRequireNotBlank()
final Unit setItemRequireNotBlank(String itemRequireNotBlank)
final Integer getItemRequireSize()
final Unit setItemRequireSize(Integer itemRequireSize)
final Integer getItemRequireImages()
final Unit setItemRequireImages(Integer itemRequireImages)
final Integer getItemRequireAnchors()
final Unit setItemRequireAnchors(Integer itemRequireAnchors)
final Boolean getShortenKey()
final Unit setShortenKey(@Deprecated(message = "Use ignoreQuery instead") Boolean shortenKey)
final Boolean getIgnoreQuery()
final Unit setIgnoreQuery(Boolean ignoreQuery)
final Boolean getPersist()
final Unit setPersist(Boolean persist)
final Boolean getStoreContent()
final Unit setStoreContent(Boolean storeContent)
final Boolean getRefresh()
final Unit setRefresh(Boolean refresh)
final Boolean getRetryFailed()
    Force a retry of fetching the page if it failed last time or is marked as gone. This option is deprecated and replaced by ignoreFailure, which is more descriptive.
final Unit setRetryFailed(@Deprecated(message = "Replaced by ignoreFailure, will be removed in further versions") Boolean retryFailed)
    Force a retry of fetching the page if it failed last time or is marked as gone. This option is deprecated and replaced by ignoreFailure, which is more descriptive.
final Boolean getIgnoreFailure()
    Force a retry of fetching the page if it failed last time or is marked as gone.
final Unit setIgnoreFailure(Boolean ignoreFailure)
    Force a retry of fetching the page if it failed last time or is marked as gone.
final Integer getNMaxRetry()
final Unit setNMaxRetry(Integer nMaxRetry)
final Integer getNJitRetry()
final Unit setNJitRetry(Integer nJitRetry)
final Boolean getLazyFlush()
final Unit setLazyFlush(Boolean lazyFlush)
final Boolean getPreferParallel()
final Unit setPreferParallel(Boolean preferParallel)
final Boolean getIncognito()
final Unit setIncognito(Boolean incognito)
final Boolean getBackground()
final Unit setBackground(Boolean background)
final Boolean getNoRedirect()
final Unit setNoRedirect(Boolean noRedirect)
final Boolean getHardRedirect()
final Unit setHardRedirect(Boolean hardRedirect)
final Boolean getParse()
final Unit setParse(Boolean parse)
final Boolean getReparseLinks()
final Unit setReparseLinks(Boolean reparseLinks)
final Boolean getIgnoreUrlQuery()
final Unit setIgnoreUrlQuery(Boolean ignoreUrlQuery)
final Boolean getNoNorm()
final Unit setNoNorm(Boolean noNorm)
final Boolean getNoFilter()
final Unit setNoFilter(Boolean noFilter)
final Condition getNetCondition()
final Unit setNetCondition(Condition netCondition)
final Integer getTest()
final Unit setTest(Integer test)
final String getVersion()
final Unit setVersion(String version)
final String getOutLinkSelectorOrNull()
final String getCorrectedOutLinkSelector()
final String getReferrer()
final Unit setReferrer(String referrer)
Params getModifiedParams()
Map<String, Object> getModifiedOptions()
final VolatileConfig getConf()
final PulsarEventHandler getEventHandler()
final Unit setEventHandler(PulsarEventHandler eventHandler)
Boolean getIsHelp()
Unit setIsHelp(Boolean isHelp)
final Boolean getExpandAtSign()
final Unit setExpandAtSign(Boolean expandAtSign)
final Boolean getAcceptUnknownOptions()
final Unit setAcceptUnknownOptions(Boolean acceptUnknownOptions)
final Boolean getAllowParameterOverwriting()
final Unit setAllowParameterOverwriting(Boolean allowParameterOverwriting)
final String getArgs()
final Array<String> getArgv()
    The argument vector
Boolean parse()
    Parse with parameter overwriting fix
LoadOptions createItemOptions()
final Boolean isExpired(Instant prevFetchTime)
    Check if the page has expired.
final Boolean isDead()
Unit itemOptions2MajorOptions()
final VolatileConfig overrideConfiguration()
final VolatileConfig overrideConfiguration(VolatileConfig conf)
Boolean isDefault(String option)
Params getParams()
String toString()
Boolean equals(Object other)
Integer hashCode()
LoadOptions clone()
    Create a new LoadOptions
-
Methods inherited from class ai.platon.pulsar.common.options.CommonOptions
addObjects, parseOrExit, setObjects, toArgsMap, toArgv, toCmdLine, toMutableArgsMap, usage
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
-
Constructor Detail
-
LoadOptions
LoadOptions(Array<String> argv, VolatileConfig conf, PulsarEventHandler eventHandler)
-
-
Method Detail
-
getTaskTime
final Instant getTaskTime()
The task time accepts the following date-time formats:
ISO_INSTANT: yyyy-MM-ddTHH:mm:ssZ
yyyy-MM-dd[ HH[:mm:ss]]
-
setTaskTime
final Unit setTaskTime(Instant taskTime)
The task time accepts the following date-time formats:
ISO_INSTANT: yyyy-MM-ddTHH:mm:ssZ
yyyy-MM-dd[ HH[:mm:ss]]
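The two accepted shapes for the task time can be parsed with the standard java.time API. The sketch below is only an illustration: `TaskTimeParser` and its `parse` method are hypothetical names, the code assumes that values without a zone are read as UTC, and the library's own parsing may differ. In java.time pattern letters, `HH` is the hour of day and `mm` the minute of hour.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

// Hypothetical helper, not part of LoadOptions.
final class TaskTimeParser {
    // A date with an optional hour and an optional minute:second part.
    // Missing time fields default to zero.
    private static final DateTimeFormatter DATE_WITH_OPTIONAL_TIME = new DateTimeFormatterBuilder()
            .appendPattern("yyyy-MM-dd[ HH[:mm:ss]]")
            .parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
            .parseDefaulting(ChronoField.MINUTE_OF_HOUR, 0)
            .parseDefaulting(ChronoField.SECOND_OF_MINUTE, 0)
            .toFormatter();

    static Instant parse(String text) {
        String trimmed = text.trim();
        try {
            // First shape: ISO_INSTANT, for example 2019-04-24T08:30:00Z.
            return Instant.parse(trimmed);
        } catch (DateTimeParseException notIsoInstant) {
            // Second shape: a plain date, optionally followed by a time.
            // Assumption: such values are interpreted as UTC.
            return LocalDateTime.parse(trimmed, DATE_WITH_OPTIONAL_TIME).toInstant(ZoneOffset.UTC);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("2019-04-24T08:30:00Z"));
        System.out.println(parse("2019-04-24"));
        System.out.println(parse("2019-04-24 08"));
        System.out.println(parse("2019-04-24 08:30:00"));
    }
}
```

Trying the strict ISO_INSTANT form first keeps the lenient pattern from masking a malformed instant; input that matches neither shape still raises DateTimeParseException.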
-
getDeadTime
final Instant getDeadTime()
-
setDeadTime
final Unit setDeadTime(Instant deadTime)
-
getAuthToken
final String getAuthToken()
-
setAuthToken
final Unit setAuthToken(String authToken)
-
getReadonly
final Boolean getReadonly()
-
setReadonly
final Unit setReadonly(Boolean readonly)
-
getIsResource
final Boolean getIsResource()
-
setIsResource
final Unit setIsResource(Boolean isResource)
-
getExpires
final Duration getExpires()
Web page expiry time. The term "expires" is usually used for an expiry time, for example in http-equiv or in the cookie specification, where it presumably means "expires at".
The expires field supports both the ISO-8601 standard format and the Hadoop time duration format. ISO-8601 standard: PnDTnHnMn.nS. Hadoop time duration format: valid units are ns, us, ms, s, m, h, d.
-
setExpires
final Unit setExpires(Duration expires)
Web page expiry time. The term "expires" is usually used for an expiry time, for example in http-equiv or in the cookie specification, where it presumably means "expires at".
The expires field supports both the ISO-8601 standard format and the Hadoop time duration format. ISO-8601 standard: PnDTnHnMn.nS. Hadoop time duration format: valid units are ns, us, ms, s, m, h, d.
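To make the two duration formats concrete, here is a minimal sketch that accepts either one. `ExpiresParser` is a hypothetical helper written for this page, not the library's parser; it assumes the Hadoop-style form is a whole number followed by one of the listed unit suffixes (for example `10s` or `1d`).

```java
import java.time.Duration;
import java.time.format.DateTimeParseException;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of LoadOptions.
final class ExpiresParser {
    // A whole number followed by one of the documented unit suffixes.
    private static final Pattern HADOOP_STYLE = Pattern.compile("(\\d+)\\s*(ns|us|ms|s|m|h|d)");

    static Duration parse(String text) {
        String trimmed = text.trim();
        try {
            // ISO-8601 form, for example PT1H30M or P1D.
            return Duration.parse(trimmed);
        } catch (DateTimeParseException notIso) {
            // Fall through to the unit-suffix form.
        }
        Matcher matcher = HADOOP_STYLE.matcher(trimmed.toLowerCase(Locale.ROOT));
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Unsupported duration: " + text);
        }
        long amount = Long.parseLong(matcher.group(1));
        switch (matcher.group(2)) {
            case "ns": return Duration.ofNanos(amount);
            case "us": return Duration.ofNanos(Math.multiplyExact(amount, 1000L));
            case "ms": return Duration.ofMillis(amount);
            case "s":  return Duration.ofSeconds(amount);
            case "m":  return Duration.ofMinutes(amount);
            case "h":  return Duration.ofHours(amount);
            default:   return Duration.ofDays(amount); // "d" is the only unit left
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("PT1H30M"));
        System.out.println(parse("10s"));
        System.out.println(parse("1d"));
    }
}
```

Note that `m` means minutes in the unit-suffix form, while in the ISO-8601 form `M` means minutes only after the `T` designator.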
-
getExpireAt
final Instant getExpireAt()
The page is expired if the current time is later than expireAt.
-
setExpireAt
final Unit setExpireAt(Instant expireAt)
The page is expired if the current time is later than expireAt.
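One way an absolute expireAt and a relative expires duration could be combined into a single check is sketched below. This is an assumption made for illustration: `ExpiryCheck` and its parameters are hypothetical, and the actual logic behind isExpired(Instant prevFetchTime) is not shown on this page.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of LoadOptions.
final class ExpiryCheck {
    /**
     * Assumed rule: a page is expired when the current time is later than
     * expireAt, or when more than `expires` has elapsed since the previous fetch.
     */
    static boolean isExpired(Instant prevFetchTime, Duration expires, Instant expireAt, Instant now) {
        if (now.isAfter(expireAt)) {
            return true; // absolute deadline passed
        }
        // Relative deadline: previous fetch time plus the allowed age.
        return now.isAfter(prevFetchTime.plus(expires));
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        Instant fetchedYesterday = now.minus(Duration.ofDays(1));
        Instant farFuture = now.plus(Duration.ofDays(365));
        System.out.println(isExpired(fetchedYesterday, Duration.ofHours(1), farFuture, now));
        System.out.println(isExpired(fetchedYesterday, Duration.ofDays(2), farFuture, now));
    }
}
```

Passing `now` as a parameter instead of calling Instant.now() inside the method keeps the check deterministic and easy to test.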
-
getFetchInterval
final Duration getFetchInterval()
The page is expired if the current time is later than expireAt.
-
setFetchInterval
final Unit setFetchInterval(Duration fetchInterval)
The page is expired if the current time is later than expireAt.
-
getOutLinkSelector
final String getOutLinkSelector()
Arrange links
-
setOutLinkSelector
final Unit setOutLinkSelector(String outLinkSelector)
Arrange links
-
getOutLinkPattern
final String getOutLinkPattern()
-
setOutLinkPattern
final Unit setOutLinkPattern(String outLinkPattern)
-
getClickTarget
final String getClickTarget()
-
setClickTarget
final Unit setClickTarget(String clickTarget)
-
getNextPageSelector
final String getNextPageSelector()
-
setNextPageSelector
final Unit setNextPageSelector(String nextPageSelector)
-
getTopLinks
final Integer getTopLinks()
-
setTopLinks
final Unit setTopLinks(Integer topLinks)
-
getTopNAnchorGroups
final Integer getTopNAnchorGroups()
-
setTopNAnchorGroups
final Unit setTopNAnchorGroups(Integer topNAnchorGroups)
-
getWaitNonBlank
final String getWaitNonBlank()
-
setWaitNonBlank
final Unit setWaitNonBlank(String waitNonBlank)
-
getRequireNotBlank
final String getRequireNotBlank()
-
setRequireNotBlank
final Unit setRequireNotBlank(String requireNotBlank)
-
getRequireSize
final Integer getRequireSize()
-
setRequireSize
final Unit setRequireSize(Integer requireSize)
-
getRequireImages
final Integer getRequireImages()
-
setRequireImages
final Unit setRequireImages(Integer requireImages)
-
getRequireAnchors
final Integer getRequireAnchors()
-
setRequireAnchors
final Unit setRequireAnchors(Integer requireAnchors)
-
getFetchMode
final FetchMode getFetchMode()
-
setFetchMode
final Unit setFetchMode(FetchMode fetchMode)
-
getBrowser
final BrowserType getBrowser()
TODO: session-scoped browser choice is not supported yet.
-
setBrowser
final Unit setBrowser(BrowserType browser)
TODO: session-scoped browser choice is not supported yet.
-
getScrollCount
final Integer getScrollCount()
-
setScrollCount
final Unit setScrollCount(Integer scrollCount)
-
getScrollInterval
final Duration getScrollInterval()
-
setScrollInterval
final Unit setScrollInterval(Duration scrollInterval)
-
getScriptTimeout
final Duration getScriptTimeout()
-
setScriptTimeout
final Unit setScriptTimeout(Duration scriptTimeout)
-
getPageLoadTimeout
final Duration getPageLoadTimeout()
-
setPageLoadTimeout
final Unit setPageLoadTimeout(Duration pageLoadTimeout)
-
getItemBrowser
final BrowserType getItemBrowser()
TODO: session-scoped browser choice is not supported yet.
-
setItemBrowser
final Unit setItemBrowser(BrowserType itemBrowser)
TODO: session-scoped browser choice is not supported yet.
-
getItemExpires
final Duration getItemExpires()
-
setItemExpires
final Unit setItemExpires(Duration itemExpires)
-
getItemExpireAt
final Instant getItemExpireAt()
Web page expiry time
-
setItemExpireAt
final Unit setItemExpireAt(Instant itemExpireAt)
Web page expiry time
-
getItemScrollCount
final Integer getItemScrollCount()
Note: if the page is scrolled too many times, it may fail to calculate the vision information.
-
setItemScrollCount
final Unit setItemScrollCount(Integer itemScrollCount)
Note: if the page is scrolled too many times, it may fail to calculate the vision information.
-
getItemScrollInterval
final Duration getItemScrollInterval()
-
setItemScrollInterval
final Unit setItemScrollInterval(Duration itemScrollInterval)
-
getItemScriptTimeout
final Duration getItemScriptTimeout()
-
setItemScriptTimeout
final Unit setItemScriptTimeout(Duration itemScriptTimeout)
-
getItemPageLoadTimeout
final Duration getItemPageLoadTimeout()
-
setItemPageLoadTimeout
final Unit setItemPageLoadTimeout(Duration itemPageLoadTimeout)
-
getItemRequireNotBlank
final String getItemRequireNotBlank()
-
setItemRequireNotBlank
final Unit setItemRequireNotBlank(String itemRequireNotBlank)
-
getItemRequireSize
final Integer getItemRequireSize()
-
setItemRequireSize
final Unit setItemRequireSize(Integer itemRequireSize)
-
getItemRequireImages
final Integer getItemRequireImages()
-
setItemRequireImages
final Unit setItemRequireImages(Integer itemRequireImages)
-
getItemRequireAnchors
final Integer getItemRequireAnchors()
-
setItemRequireAnchors
final Unit setItemRequireAnchors(Integer itemRequireAnchors)
-
getShortenKey
final Boolean getShortenKey()
-
setShortenKey
final Unit setShortenKey(@Deprecated(message = "Use ignoreQuery instead") Boolean shortenKey)
-
getIgnoreQuery
final Boolean getIgnoreQuery()
-
setIgnoreQuery
final Unit setIgnoreQuery(Boolean ignoreQuery)
-
getPersist
final Boolean getPersist()
-
setPersist
final Unit setPersist(Boolean persist)
-
getStoreContent
final Boolean getStoreContent()
-
setStoreContent
final Unit setStoreContent(Boolean storeContent)
-
getRefresh
final Boolean getRefresh()
-
setRefresh
final Unit setRefresh(Boolean refresh)
-
getRetryFailed
final Boolean getRetryFailed()
Force a retry of fetching the page if it failed last time or is marked as gone. This option is deprecated and replaced by ignoreFailure, which is more descriptive.
-
setRetryFailed
final Unit setRetryFailed(@Deprecated(message = "Replaced by ignoreFailure, will be removed in further versions") Boolean retryFailed)
Force a retry of fetching the page if it failed last time or is marked as gone. This option is deprecated and replaced by ignoreFailure, which is more descriptive.
-
getIgnoreFailure
final Boolean getIgnoreFailure()
Force a retry of fetching the page if it failed last time or is marked as gone.
-
setIgnoreFailure
final Unit setIgnoreFailure(Boolean ignoreFailure)
Force a retry of fetching the page if it failed last time or is marked as gone.
-
getNMaxRetry
final Integer getNMaxRetry()
-
setNMaxRetry
final Unit setNMaxRetry(Integer nMaxRetry)
-
getNJitRetry
final Integer getNJitRetry()
-
setNJitRetry
final Unit setNJitRetry(Integer nJitRetry)
-
getLazyFlush
final Boolean getLazyFlush()
-
setLazyFlush
final Unit setLazyFlush(Boolean lazyFlush)
-
getPreferParallel
final Boolean getPreferParallel()
-
setPreferParallel
final Unit setPreferParallel(Boolean preferParallel)
-
getIncognito
final Boolean getIncognito()
-
setIncognito
final Unit setIncognito(Boolean incognito)
-
getBackground
final Boolean getBackground()
-
setBackground
final Unit setBackground(Boolean background)
-
getNoRedirect
final Boolean getNoRedirect()
-
setNoRedirect
final Unit setNoRedirect(Boolean noRedirect)
-
getHardRedirect
final Boolean getHardRedirect()
-
setHardRedirect
final Unit setHardRedirect(Boolean hardRedirect)
-
getReparseLinks
final Boolean getReparseLinks()
-
setReparseLinks
final Unit setReparseLinks(Boolean reparseLinks)
-
getIgnoreUrlQuery
final Boolean getIgnoreUrlQuery()
-
setIgnoreUrlQuery
final Unit setIgnoreUrlQuery(Boolean ignoreUrlQuery)
-
getNoFilter
final Boolean getNoFilter()
-
setNoFilter
final Unit setNoFilter(Boolean noFilter)
-
getNetCondition
final Condition getNetCondition()
-
setNetCondition
final Unit setNetCondition(Condition netCondition)
-
getVersion
final String getVersion()
-
setVersion
final Unit setVersion(String version)
-
getOutLinkSelectorOrNull
final String getOutLinkSelectorOrNull()
-
getCorrectedOutLinkSelector
final String getCorrectedOutLinkSelector()
-
getReferrer
final String getReferrer()
-
setReferrer
final Unit setReferrer(String referrer)
-
getModifiedParams
Params getModifiedParams()
-
getModifiedOptions
Map<String, Object> getModifiedOptions()
-
getConf
final VolatileConfig getConf()
-
getEventHandler
final PulsarEventHandler getEventHandler()
-
setEventHandler
final Unit setEventHandler(PulsarEventHandler eventHandler)
-
getExpandAtSign
final Boolean getExpandAtSign()
-
setExpandAtSign
final Unit setExpandAtSign(Boolean expandAtSign)
-
getAcceptUnknownOptions
final Boolean getAcceptUnknownOptions()
-
setAcceptUnknownOptions
final Unit setAcceptUnknownOptions(Boolean acceptUnknownOptions)
-
getAllowParameterOverwriting
final Boolean getAllowParameterOverwriting()
-
setAllowParameterOverwriting
final Unit setAllowParameterOverwriting(Boolean allowParameterOverwriting)
-
createItemOptions
LoadOptions createItemOptions()
-
itemOptions2MajorOptions
Unit itemOptions2MajorOptions()
-
overrideConfiguration
final VolatileConfig overrideConfiguration()
-
overrideConfiguration
final VolatileConfig overrideConfiguration(VolatileConfig conf)
-
getParams
Params getParams()
-
clone
LoadOptions clone()
Create a new LoadOptions
-
-
-