browser_utils.requests_markdown_browser
RequestsMarkdownBrowser
class RequestsMarkdownBrowser(AbstractMarkdownBrowser)
(In preview) A minimal Markdown web browser built on the Python requests library. It cannot run JavaScript or compute CSS; it simply fetches the HTML document and converts it to Markdown. See AbstractMarkdownBrowser for more details.
__init__
def __init__(start_page: Optional[str] = None,
viewport_size: Optional[int] = 1024 * 8,
downloads_folder: Optional[Union[str, None]] = None,
search_engine: Optional[Union[AbstractMarkdownSearch,
None]] = None,
markdown_converter: Optional[Union[MarkdownConverter,
None]] = None,
requests_session: Optional[Union[requests.Session, None]] = None,
requests_get_kwargs: Optional[Union[Dict[str, Any],
None]] = None)
Instantiate a new RequestsMarkdownBrowser.
Arguments:
start_page
- The page on which the browser starts (default: "about:blank").
viewport_size
- Approximately how many characters fit in the viewport. Viewport dimensions are adjusted dynamically to avoid cutting off words (default: 8192).
downloads_folder
- Path to where downloads are saved. If None, downloads are disabled (default: None).
search_engine
- An instance of AbstractMarkdownSearch, which handles web searches performed by this browser (default: a new BingMarkdownSearch() with default parameters).
markdown_converter
- An instance of MarkdownConverter used to convert HTML pages and downloads to Markdown (default: a new MarkdownConverter() with default parameters).
requests_session
- The session from which to issue requests (default: a new requests.Session() instance with default parameters).
requests_get_kwargs
- Extra parameters passed to every .get() call made to requests.
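The viewport_size description above says the viewport is adjusted so that words are not cut off. The following is a minimal sketch of one way such pagination could work, using only the standard library. The function name split_into_viewports is hypothetical, and this is not necessarily how RequestsMarkdownBrowser implements it.

```python
from typing import List, Tuple


def split_into_viewports(text: str, viewport_size: int = 1024 * 8) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets for each viewport page.

    Hypothetical helper: each page holds roughly `viewport_size` characters,
    stretched forward to the next whitespace so no word is split.
    """
    pages: List[Tuple[int, int]] = []
    start = 0
    while start < len(text):
        end = min(start + viewport_size, len(text))
        # Grow the page until it ends on whitespace or the text runs out.
        while end < len(text) and not text[end - 1].isspace():
            end += 1
        pages.append((start, end))
        start = end
    # An empty document still has a single, empty page.
    return pages or [(0, 0)]


text = "alpha beta gamma delta"
pages = split_into_viewports(text, viewport_size=8)
chunks = [text[s:e] for s, e in pages]
```

With a viewport_size of 8, the first page grows to 11 characters so that "beta" stays whole, which is why the documented size is only approximate.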
address
@property
def address() -> str
Return the address of the current page.
set_address
def set_address(uri_or_path: str) -> None
Sets the address of the current page. This will result in the page being fetched via the underlying requests session.
Arguments:
uri_or_path
- The fully-qualified URI to fetch, or the path to fetch from the current location. If the URI protocol is search:, the remainder of the URI is interpreted as a search query, and a web search is performed. If the URI protocol is file://, the remainder of the URI is interpreted as a local absolute file path.
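The dispatch rules described for uri_or_path can be illustrated with a small standard-library sketch. The function classify_address and its "search", "file", and "web" labels are hypothetical names chosen for this example; the real browser may handle more cases (for instance about:blank or percent-encoded file paths) and is not shown here.

```python
from typing import Tuple
from urllib.parse import urljoin, urlparse


def classify_address(uri_or_path: str, current_address: str = "about:blank") -> Tuple[str, str]:
    """Return a (kind, value) pair describing how an address would be handled.

    Hypothetical illustration of the documented rules, not the library's code.
    """
    if uri_or_path.startswith("search:"):
        # Everything after the scheme is treated as the search query.
        return ("search", uri_or_path[len("search:"):].strip())
    if uri_or_path.startswith("file://"):
        # The remainder is a local absolute path (no percent-decoding here).
        return ("file", urlparse(uri_or_path).path)
    if urlparse(uri_or_path).scheme in ("http", "https"):
        return ("web", uri_or_path)
    # Anything else is assumed to be a path relative to the current page.
    return ("web", urljoin(current_address, uri_or_path))


query = classify_address("search: markdown browser")
local = classify_address("file:///home/user/notes.txt")
relative = classify_address("api/browser.html", "https://example.com/docs/index.html")
```

Here relative resolves to https://example.com/docs/api/browser.html, since urljoin replaces the last path segment of the current address.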
viewport
@property
def viewport() -> str
Return the content of the current viewport.
page_content
@property
def page_content() -> str
Return the full contents of the current page.
page_down
def page_down() -> None
Move the viewport down one page, if possible.
page_up
def page_up() -> None
Move the viewport up one page, if possible.
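The "if possible" qualifier on page_down and page_up suggests that movement stops at the first and last pages. A minimal sketch of that behaviour follows, assuming the document has already been split into pages. ViewportCursor is a hypothetical stand-in, not a class from this module.

```python
from typing import List


class ViewportCursor:
    """Hypothetical helper that tracks which page of a document is visible."""

    def __init__(self, pages: List[str]) -> None:
        # Always keep at least one page so `viewport` is well defined.
        self.pages = pages or [""]
        self.current = 0

    @property
    def viewport(self) -> str:
        return self.pages[self.current]

    def page_down(self) -> None:
        # Clamp at the last page instead of running past the end.
        self.current = min(self.current + 1, len(self.pages) - 1)

    def page_up(self) -> None:
        # Clamp at the first page.
        self.current = max(self.current - 1, 0)


cursor = ViewportCursor(["page one", "page two"])
cursor.page_up()    # already at the top, so nothing changes
top = cursor.viewport
cursor.page_down()
cursor.page_down()  # already at the bottom, so nothing changes
bottom = cursor.viewport
```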
find_on_page
def find_on_page(query: str) -> Union[str, None]
Searches for the query from the current viewport forward, looping back to the start if necessary.
find_next
def find_next() -> None
Scroll to the next viewport that matches the query.
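The forward search with wrap-around described for find_on_page and find_next can be sketched as a scan over viewport pages. This example assumes plain case-insensitive substring matching, which may differ from the matching the browser actually performs; find_from is a hypothetical helper.

```python
from typing import List, Optional


def find_from(pages: List[str], query: str, start: int) -> Optional[int]:
    """Return the index of the first page at or after `start` containing `query`.

    The scan wraps around to page 0 after the last page and returns None
    when no page matches. Matching is a case-insensitive substring test
    (an assumption made for this sketch).
    """
    needle = query.lower()
    count = len(pages)
    for offset in range(count):
        index = (start + offset) % count
        if needle in pages[index].lower():
            return index
    return None


pages = ["intro text", "Install guide", "api reference", "install faq"]
first_hit = find_from(pages, "install", start=2)           # search from page 2 forward
next_hit = find_from(pages, "install", start=first_hit + 1)  # continue past the hit, wrapping
no_hit = find_from(pages, "missing", start=0)
```

Starting the second search one page after the previous hit is what makes a find_next style call advance rather than return the same page again.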
visit_page
def visit_page(path_or_uri: str) -> str
Update the address, visit the page, and return the content of the viewport.
open_local_file
def open_local_file(local_path: str) -> str
Convert a local file path to a file:/// URI, update the address, visit the page, and return the contents of the viewport.
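The path-to-URI conversion mentioned for open_local_file can be reproduced with the standard library. This is a sketch under the assumption that the path is first made absolute; local_path_to_file_uri is a hypothetical name, and the example path does not need to exist on disk.

```python
import os
from pathlib import Path


def local_path_to_file_uri(local_path: str) -> str:
    """Convert a local path into a file:/// URI (hypothetical helper)."""
    # Path.as_uri() requires an absolute path, so expand "~" and resolve
    # relative paths against the current working directory first.
    absolute = os.path.abspath(os.path.expanduser(local_path))
    # as_uri() also percent-encodes characters such as spaces.
    return Path(absolute).as_uri()


uri = local_path_to_file_uri("reports/summary 2024.md")
```

The resulting string could then be handed to a method such as visit_page or set_address, which the documentation above says accept file:// URIs.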