- All Implemented Interfaces:
ai.platon.pulsar.common.config.Configurable, ai.platon.pulsar.crawl.protocol.Protocol, java.lang.AutoCloseable
public abstract class AbstractHttpProtocol implements Protocol
-
-
Nested Class Summary
Nested Classes
Modifier and Type     Class                            Description
public class          AbstractHttpProtocol.Companion
-
Field Summary
Fields
Modifier and Type     Field             Description
private final Boolean isActive
private final Boolean supportParallel
-
Constructor Summary
Constructors
Constructor               Description
AbstractHttpProtocol()
-
Method Summary
Modifier and Type     Method / Description
final Boolean         getIsActive()
Boolean               getSupportParallel()
ImmutableConfig       getConf()
Unit                  setConf(ImmutableConfig jobConf)
Unit                  reset()
                      Resets the protocol environment so that the peer host sees the client as a new one.
Collection<Response>  getResponses(Collection<WebPage> pages, VolatileConfig volatileConfig)
ProtocolOutput        getProtocolOutput(WebPage page)
                      Returns the ProtocolOutput for a fetch list entry.
ProtocolOutput        getProtocolOutputDeferred(WebPage page)
                      Returns the ProtocolOutput for a fetch list entry.
abstract Response     getResponse(WebPage page, Boolean followRedirects)
abstract Response     getResponseDeferred(WebPage page, Boolean followRedirects)
BaseRobotRules        getRobotRules(WebPage page)
                      Retrieves the robot rules applicable to this URL.
Unit                  close()
String                toString()
-
Method Detail
-
getIsActive
final Boolean getIsActive()
-
getSupportParallel
Boolean getSupportParallel()
-
getConf
ImmutableConfig getConf()
-
getResponses
Collection<Response> getResponses(Collection<WebPage> pages, VolatileConfig volatileConfig)
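This page does not describe how getResponses treats the collection. As a rough sketch only, the following assumes a batch fetch that delegates to the single-page fetch and drops pages that yield no response. BatchPage, BatchResponse and BatchFetchSketch are hypothetical stand-ins, not the Pulsar WebPage, Response or AbstractHttpProtocol types, and the VolatileConfig argument is left out.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for WebPage: only carries a URL.
final class BatchPage {
    final String url;
    BatchPage(String url) { this.url = url; }
}

// Hypothetical stand-in for Response: URL plus HTTP status code.
final class BatchResponse {
    final String url;
    final int status;
    BatchResponse(String url, int status) { this.url = url; this.status = status; }
}

abstract class BatchFetchSketch {
    // Single-page fetch, supplied by a concrete subclass.
    abstract BatchResponse getResponse(BatchPage page, boolean followRedirects);

    // Assumed behaviour: fetch pages one at a time and skip null responses.
    Collection<BatchResponse> getResponses(Collection<BatchPage> pages) {
        List<BatchResponse> responses = new ArrayList<>();
        for (BatchPage page : pages) {
            BatchResponse response = getResponse(page, false);
            if (response != null) {
                responses.add(response);
            }
        }
        return responses;
    }
}
```

A protocol that reports parallel support through getSupportParallel() might fetch the pages concurrently instead; the sequential loop is simply the smallest form that keeps the sketch readable.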
-
getProtocolOutput
ProtocolOutput getProtocolOutput(WebPage page)
Returns the ProtocolOutput for a fetch list entry.
-
getProtocolOutputDeferred
ProtocolOutput getProtocolOutputDeferred(WebPage page)
Returns the ProtocolOutput for a fetch list entry.
-
getResponse
abstract Response getResponse(WebPage page, Boolean followRedirects)
-
getResponseDeferred
abstract Response getResponseDeferred(WebPage page, Boolean followRedirects)
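Because getResponse and getResponseDeferred are abstract, a concrete protocol has to supply the transport while the base class provides getProtocolOutput. The sketch below illustrates that split as a template method, under the assumption that getProtocolOutput wraps whatever getResponse returns. SketchPage, SketchResponse, SketchOutput, HttpProtocolSketch and FixedStatusProtocol are hypothetical names for illustration, not Pulsar classes, and the 2xx success rule is an assumption.

```java
import java.util.Objects;

// Hypothetical stand-in for WebPage.
final class SketchPage {
    final String url;
    SketchPage(String url) { this.url = Objects.requireNonNull(url); }
}

// Hypothetical stand-in for Response.
final class SketchResponse {
    final String url;
    final int status;
    SketchResponse(String url, int status) { this.url = url; this.status = status; }
}

// Hypothetical stand-in for ProtocolOutput.
final class SketchOutput {
    final SketchResponse response;
    final boolean success;
    SketchOutput(SketchResponse response, boolean success) {
        this.response = response;
        this.success = success;
    }
}

abstract class HttpProtocolSketch {
    // Subclasses decide how a page is actually fetched.
    abstract SketchResponse getResponse(SketchPage page, boolean followRedirects);

    // Template method: wraps the subclass response in an output object.
    // Assumption: any 2xx status counts as success.
    SketchOutput getProtocolOutput(SketchPage page) {
        SketchResponse response = getResponse(page, false);
        boolean ok = response != null && response.status >= 200 && response.status < 300;
        return new SketchOutput(response, ok);
    }
}

// Minimal concrete protocol that answers every request with a fixed status.
final class FixedStatusProtocol extends HttpProtocolSketch {
    private final int status;
    FixedStatusProtocol(int status) { this.status = status; }

    @Override
    SketchResponse getResponse(SketchPage page, boolean followRedirects) {
        return new SketchResponse(page.url, status);
    }
}
```

With this layout, callers only use getProtocolOutput, and swapping the transport means writing another subclass rather than touching the wrapping logic.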
-
getRobotRules
BaseRobotRules getRobotRules(WebPage page)
Retrieves the robot rules applicable to this URL.
- Parameters:
page - The web page
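A caller would typically consult the returned rules before fetching a URL. The following sketch shows that check with a hypothetical RobotRulesSketch class that only supports prefix-based Disallow entries. It is a simplified stand-in for BaseRobotRules, not its real implementation, and the isAllowed method here is part of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for BaseRobotRules.
final class RobotRulesSketch {
    private final List<String> disallowedPrefixes;

    RobotRulesSketch(List<String> disallowedPrefixes) {
        // Defensive copy so later changes to the caller's list have no effect.
        this.disallowedPrefixes = new ArrayList<>(disallowedPrefixes);
    }

    // A path is allowed unless it starts with one of the disallowed prefixes.
    boolean isAllowed(String path) {
        for (String prefix : disallowedPrefixes) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    // Keeps only the paths that the rules permit fetching.
    List<String> filterAllowed(List<String> paths) {
        List<String> allowed = new ArrayList<>();
        for (String path : paths) {
            if (isAllowed(path)) {
                allowed.add(path);
            }
        }
        return allowed;
    }
}
```

Real robots.txt handling also covers Allow directives, wildcards and crawl delays, which this sketch deliberately leaves out.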