API Reference
Authenticating user(s)
class usos.authentication.Credentials(username, password)
Manages and stores credentials required to successfully authenticate the user.
Firing up a new user instance is very simple:
    from usos.authentication import Credentials

    credentials = Credentials(
        username="123456",
        password="password")
That way, you can set up and manage multiple accounts, for example:
    john = Credentials(username="johndoe", password="...")
    anna = Credentials(username="anna1995", password="...")
Parameters:
- username (str) – username used in the Central Authentication System
- password (str) – password bound to the username
class usos.authentication.Authentication(credentials, root_url, web_driver)
Authenticates the user with provided credentials.
Once you have an instance of the user’s Credentials, you can proceed with authentication. In this example, the environment variable USOS_SCRAPER_ROOT_URL is used as the root URL of the requested USOSweb application.
    import os

    from usos.authentication import Authentication
    ...
    auth = Authentication(
        credentials=john,
        root_url=os.environ["USOS_SCRAPER_ROOT_URL"],
        web_driver=selenium_driver)
Parameters:
- credentials (object) – username and password in an instance of Credentials.
- root_url (str) – root URL of the USOSweb application the login procedure will be executed on.
- web_driver (object) – an instance of usos.web_driver.SeleniumDriver responsible for controlling the browser.
is_authenticated()
Checks whether the user is authenticated.
The current implementation of this method first checks whether the user has recently been signed in and, if not, attempts to execute sign_in().
    if auth.is_authenticated():
        # Even though the user has not been signed in before,
        # the method will attempt to sign them in and, if it
        # succeeds, do_serious_stuff() will be executed.
        do_serious_stuff()
Return type: bool
Returns: False only if the authentication process has failed.
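The check-then-sign-in behaviour described for is_authenticated() can be sketched as follows. This is a minimal illustration, not the library's code: AuthSketch, its _signed_in flag and the login_ok argument are hypothetical stand-ins, and the real sign_in() drives a browser instead of returning a canned value.

```python
class AuthSketch:
    """Hypothetical stand-in for Authentication; no browser involved."""

    def __init__(self, login_ok: bool) -> None:
        self._login_ok = login_ok   # assumption: simulates the login outcome
        self._signed_in = False     # assumption: remembers an earlier sign-in
        self.sign_in_calls = 0

    def sign_in(self) -> bool:
        # The real method would drive the web driver through the login form.
        self.sign_in_calls += 1
        self._signed_in = self._login_ok
        return self._signed_in

    def is_authenticated(self) -> bool:
        # Reuse an earlier sign-in if there is one, otherwise try to sign in.
        if self._signed_in:
            return True
        return self.sign_in()


auth = AuthSketch(login_ok=True)
first = auth.is_authenticated()    # triggers sign_in()
second = auth.is_authenticated()   # reuses the earlier sign-in
failed = AuthSketch(login_ok=False).is_authenticated()
```

Under these assumptions, a False result is only possible when the sign-in attempt itself fails, which matches the documented return value.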
sign_in()
Performs the sign in procedure using a web_driver provided to the initializer.
Return type: bool
Returns: True if the procedure was successful.
Managing web drivers
class usos.web_driver.SeleniumDriver(headless, config={})
Provides a layer of abstraction to obtain a preconfigured object of a Selenium-compatible web driver.
Parameters:
- headless (bool) – whether the driver should run in headless mode
- config (dict) – set of config variables to tweak the behaviour of the web driver
exception_take_screenshot(codename)
Takes a screenshot of the web driver’s current viewport.
This method can be used to troubleshoot non-trivial parsing errors.
    def perform_login_example(self, usr: str, pwd: str) -> object:
        try:
            ...
            return get_user_instance(usr, pwd)
        except Exception:
            # Capture the viewport first, so the failure can be inspected later
            driver.exception_take_screenshot("perform-login")
            logging.exception("Could not retrieve user instance")
Parameters: codename (str) – name of the exception/event that will be added to the image’s filename
Return type: None
get_instance()
Returns an instance of the selected web driver.
Changing the implementation of this method allows you to integrate different web drivers and replace Chrome (the default) with Firefox, PhantomJS or something else.
To find out more, read Using custom web drivers.
Return type: object
Returns: by default, a ChromeDriver object. Can be extended.
quit()
Forces the web driver to terminate.
Return type: None
reset()
Resets the instance of a web driver to None.
This method might prove useful while switching between different web drivers.
Return type: None
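How get_instance(), quit() and reset() might interact can be illustrated with a small sketch. It assumes the wrapper caches a single driver and creates it lazily; DriverHolder, FakeDriver and the factory argument are hypothetical names that stand in for SeleniumDriver and a real Selenium driver.

```python
from typing import Callable, Optional


class FakeDriver:
    """Hypothetical stand-in for a Selenium driver."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.closed = False

    def quit(self) -> None:
        self.closed = True


class DriverHolder:
    """Hypothetical wrapper that caches one lazily created driver."""

    def __init__(self, factory: Callable[[], FakeDriver]) -> None:
        self._factory = factory
        self._driver: Optional[FakeDriver] = None

    def get_instance(self) -> FakeDriver:
        # Create the driver on first use and reuse it afterwards.
        if self._driver is None:
            self._driver = self._factory()
        return self._driver

    def quit(self) -> None:
        # Terminate the cached driver, if one was created.
        if self._driver is not None:
            self._driver.quit()

    def reset(self) -> None:
        # Forget the cached driver so the next get_instance() builds a new one.
        self._driver = None


holder = DriverHolder(lambda: FakeDriver("chrome"))
chrome = holder.get_instance()
same = holder.get_instance()

# Switching drivers: terminate the old one, reset, then swap the factory.
holder.quit()
holder.reset()
holder._factory = lambda: FakeDriver("firefox")
firefox = holder.get_instance()
```

In this sketch, calling reset() after quit() is what lets a different driver be created on the next get_instance() call.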
Retrieving data
class usos.scraper.Scraper(root_url, destinations, authentication, data_controller, web_driver)
Navigates the interface and scrapes the data.
Parameters:
- root_url (str) – a root URL for the USOSweb interface.
- destinations (str) –
- authentication (object) – an instance of usos.authentication.Authentication for accessing protected data.
- data_controller (object) – a controller for storing and analysing scraped data.
- web_driver (object) – a Selenium web driver instance for navigating.
go_to(destination)
Navigates to the provided destination.
Parameters: destination (str) – a part of the URL that will be used to match the ScrapingTemplate.
Return type: None
quit()
Terminates the scraper.
Return type: None
run()
Runs the process of iterating through provided destinations.
Return type: None
Storing and analysing the data
class usos.data.DataController(dispatcher)
Stores and performs analysis of collected data.
    from usos.data import DataController
    from usos.notifications import Dispatcher

    my_dispatcher = Dispatcher(
        channels="Email SMS WebPush MessengerPigeon",
        enable=True,
        config_file="notifications.json")

    data = DataController(dispatcher=my_dispatcher)

    single_entry = {
        'entity': 'example-entity-type',
        'items': [...]
    }

    data.upload(single_entry)
    ...
    data.analyze()
Parameters: dispatcher (object) – instance of usos.notifications.Dispatcher responsible for providing the notifications via available channels.
_compare(old, new)
Compares two entities with each other.
Parameters:
- old (dict) – locally stored entity.
- new (dict) – newly retrieved entity.
Return type: None
_compare_items(old, new, append_if_missing=False)
Compares two lists of items.
Parameters:
- old (list) – items from an old entity.
- new (list) – items from a new entity.
- append_if_missing (bool) – whether the item should be added to the final results if it is present in new but missing in old.
Return type: list
Returns: items that share the same identifiers but with updated values.
_get_filename(data)
Returns a filename based on the data’s entity-type.
    >>> grades = {
    ...     "entity": "final-grades",
    ...     "items": []
    ... }
    >>> course_results = {
    ...     "entity": "course-results-tree",
    ...     "items": [
    ...         {
    ...             "group": "28-INF-S-DOLI",
    ...             "subgroup": "Logic for Computer Science",
    ...             "hierarchy": "/Exam",
    ...             "item": "Results",
    ...             "values": ["104.5 pkt"]
    ...         }
    ...     ]
    ... }
    >>> data._get_filename(grades)
    'data/final-grades.json'
    >>> data._get_filename(course_results)
    'data/courses/28-inf-s-doli.json'
Return type: str
Returns: filename of the JSON file for a given entity.
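The mapping itself is not spelled out here, so the following is only a guess at one rule that reproduces both paths shown for _get_filename(). The function name get_filename_sketch and the special-casing of "course-results-tree" are assumptions, and the real method may derive the path differently.

```python
def get_filename_sketch(data: dict) -> str:
    """Map an entity to a JSON path using an assumed rule."""
    entity = data["entity"]
    if entity == "course-results-tree" and data["items"]:
        # Assumption: one file per course, named after the first item's group.
        group = data["items"][0]["group"].lower()
        return f"data/courses/{group}.json"
    # Assumption: every other entity gets a file named after its type.
    return f"data/{entity}.json"


grades = {"entity": "final-grades", "items": []}
course_results = {
    "entity": "course-results-tree",
    "items": [
        {"group": "28-INF-S-DOLI", "item": "Results", "values": ["104.5 pkt"]}
    ],
}
grades_path = get_filename_sketch(grades)
course_path = get_filename_sketch(course_results)
```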
_load(filename)
Loads the data from a specified JSON file.
Parameters: filename (str) – name of the JSON file to load the data from.
Return type: dict
Returns: an entity retrieved from a file.
_same_item(old, new)
Checks whether a given new item carries the same identifiers as the old one.
    >>> rectangle = {
    ...     "group": "Shapes",
    ...     "subgroup": "Two-dimensional",
    ...     "item": "A rectangle",
    ...     "values": [10, 20]
    ... }
    >>> new_rectangle = {
    ...     "group": "Shapes",
    ...     "subgroup": "Two-dimensional",
    ...     "item": "A rectangle",
    ...     "values": [30, 50]
    ... }
    >>> square = {
    ...     "group": "Shapes",
    ...     "subgroup": "Two-dimensional",
    ...     "item": "A square",
    ...     "values": [10, 10]
    ... }
    >>> data._same_item(rectangle, square)
    False
    >>> data._same_item(rectangle, new_rectangle)
    True
Parameters:
- old (dict) – an element from the list of items of an old entity.
- new (dict) – an element from the list of items of a new entity.
Return type: bool
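The interplay between _same_item() and _compare_items() can be sketched with two standalone functions. This is an illustration under assumptions, not the library's implementation: it treats every key except "values" as an identifier, and the names same_item and compare_items are hypothetical.

```python
from typing import Dict, List

Item = Dict[str, object]


def same_item(old: Item, new: Item) -> bool:
    # Assumption: every key except "values" identifies an item.
    def identifiers(item: Item) -> Item:
        return {k: v for k, v in item.items() if k != "values"}
    return identifiers(old) == identifiers(new)


def compare_items(old: List[Item], new: List[Item],
                  append_if_missing: bool = False) -> List[Item]:
    changed: List[Item] = []
    for new_item in new:
        match = next((o for o in old if same_item(o, new_item)), None)
        if match is None:
            # Present in `new` but missing in `old`.
            if append_if_missing:
                changed.append(new_item)
        elif match["values"] != new_item["values"]:
            # Same identifiers, updated values.
            changed.append(new_item)
    return changed


rectangle = {"group": "Shapes", "item": "A rectangle", "values": [10, 20]}
new_rectangle = {"group": "Shapes", "item": "A rectangle", "values": [30, 50]}
square = {"group": "Shapes", "item": "A square", "values": [10, 10]}

updated = compare_items([rectangle], [new_rectangle, square])
with_new = compare_items([rectangle], [new_rectangle, square],
                         append_if_missing=True)
unchanged = compare_items([rectangle], [rectangle])
```

With append_if_missing left at False, the square is ignored because it has no counterpart in the old list; setting it to True includes the square in the result.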
_save(filename, data)
Saves the data to a specified JSON file.
Parameters:
- filename (str) – name of the JSON file to save the data to.
- data (dict) – data to store.
Return type: None
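A JSON round trip of the kind _save() and _load() describe could look like the sketch below. It uses only the standard library; the function names save and load, the creation of missing parent directories and the UTF-8 encoding are assumptions rather than documented behaviour.

```python
import json
import os
import tempfile


def save(filename: str, data: dict) -> None:
    # Create the parent directory if needed, then write the entity as JSON.
    os.makedirs(os.path.dirname(filename) or ".", exist_ok=True)
    with open(filename, "w", encoding="utf-8") as handle:
        json.dump(data, handle, ensure_ascii=False, indent=2)


def load(filename: str) -> dict:
    # Read the entity back from the JSON file.
    with open(filename, "r", encoding="utf-8") as handle:
        return json.load(handle)


entity = {
    "entity": "final-grades",
    "items": [{"item": "Logic", "values": ["4.5"]}],
}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data", "final-grades.json")
    save(path, entity)
    restored = load(path)
```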
analyze()
Analyzes the data stored in the temporary storage and passes the results to the notifications’ dispatcher.
Return type: None
upload(item)
Uploads a given item to a temporary data storage.
The upload() method works on dictionaries structured as entities. Learn more about entities here: Defining new entities.
    data.upload({
        'entity': 'example-entity-type',
        'items': [...]
    })
Parameters: item (dict) – item in an entity-compatible format.
Return type: None
upload_multiple(items)
Uploads a list of items to a temporary data storage.
Internally uses the upload() method on every item provided in the list.
    my_final_grades = {
        'entity': 'final-grades',
        'items': [...]
    }

    entities = []
    for i in range(0, 15):
        entities.append({
            'entity': 'iterations-for-analysis',
            'items': [{
                'group': 'Just a for loop',
                'item': 'A single iteration',
                'values': [i, i+1, i+2]
            }]
        })

    # list.append() returns None, so its result must not be reassigned
    entities.append(my_final_grades)
    data.upload_multiple(entities)
Parameters: items (list) – items in an entity-compatible format.
Return type: None
exception usos.data.NotAnEntity
An item is not in an entity-compatible format.
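The sketch below shows one way upload() could reject malformed input with NotAnEntity. The validation rule (a dict with a string 'entity' and a list of 'items'), the StorageSketch class and the locally defined exception are assumptions made for illustration; the reference does not state when the real exception is raised.

```python
from typing import List


class NotAnEntity(Exception):
    """Local stand-in for usos.data.NotAnEntity."""


class StorageSketch:
    """Hypothetical temporary storage, not the real DataController."""

    def __init__(self) -> None:
        self.items: List[dict] = []

    def upload(self, item: dict) -> None:
        # Assumption: an entity is a dict with an 'entity' name and an 'items' list.
        is_entity = (
            isinstance(item, dict)
            and isinstance(item.get("entity"), str)
            and isinstance(item.get("items"), list)
        )
        if not is_entity:
            raise NotAnEntity(f"not an entity: {item!r}")
        self.items.append(item)

    def upload_multiple(self, items: List[dict]) -> None:
        # Delegate to upload() so every item is validated the same way.
        for item in items:
            self.upload(item)


storage = StorageSketch()
storage.upload_multiple([
    {"entity": "final-grades", "items": []},
    {"entity": "course-results-tree", "items": []},
])

try:
    storage.upload({"items": []})   # missing the 'entity' key
    rejected = False
except NotAnEntity:
    rejected = True
```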
Dispatching notifications
class usos.notifications.Dispatcher(channels, enable, config_file)
Allows for sending multiple messages via configured channels.
Creating a new dispatcher that operates on selected channels:
    from usos.notifications import Dispatcher

    my_dispatcher = Dispatcher(
        channels="Email SMS WebPush MessengerPigeon",
        enable=True,
        config_file="notifications.json")
Parameters:
- channels (str) – names of the channels separated by a single space.
- enable (bool) – whether to allow the dispatcher to send any notifications.
- config_file (str) – path to a file that contains channel-specific variables such as API keys or special parameters.
send(data)
Sends notifications via channels set in the initializer.
Parameters: data (dict) – the data that will be sent.
Return type: bool
Returns: True if every notification has been sent successfully on every channel.
send_single(channel, data)
Sends notifications via a single, given channel.
Parameters:
- channel (str) – a name of the channel.
- data (dict) – data that will be used to render the templates.
Return type: bool
Returns: True if the notifications have been sent successfully on a given channel.
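The relationship between send() and send_single() can be sketched as a loop over the space-separated channel names. DispatcherSketch and its senders mapping are hypothetical; how the real Dispatcher resolves a channel name to a class, and what it returns when disabled, are assumptions here.

```python
from typing import Callable, Dict, List

Sender = Callable[[dict], bool]


class DispatcherSketch:
    """Hypothetical dispatcher; channel lookup is an assumption."""

    def __init__(self, channels: str, enable: bool,
                 senders: Dict[str, Sender]) -> None:
        self.channels = channels.split(" ")   # names separated by a single space
        self.enable = enable
        self._senders = senders               # assumption: name -> callable

    def send_single(self, channel: str, data: dict) -> bool:
        sender = self._senders.get(channel)
        return sender(data) if sender is not None else False

    def send(self, data: dict) -> bool:
        if not self.enable:
            return False   # assumption: a disabled dispatcher sends nothing
        # Try every channel, even if an earlier one fails.
        results = [self.send_single(name, data) for name in self.channels]
        return all(results)


delivered: List[tuple] = []
senders = {
    "Email": lambda data: delivered.append(("Email", data)) is None,
    "SMS": lambda data: False,   # simulates a failing channel
}
ok = DispatcherSketch("Email", True, senders).send({"message": "hi"})
partial = DispatcherSketch("Email SMS", True, senders).send({"message": "hi"})
muted = DispatcherSketch("Email", False, senders).send({"message": "hi"})
```

Collecting all results before calling all() means a failing channel does not prevent the remaining channels from being tried, which is one reasonable reading of "every notification on every channel".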
class usos.notifications.Notification(data, config={})
Provides a common layer of abstraction for every existing channel.
Use this class as a base when implementing custom streams:
    class PaperMail(Notification):
        def _render(self) -> None:
            letter: str = "Hey, {name}! " + "{message} " + "Take care, {author}."
            # Assumption: the base class keeps the constructor's data as self.data
            letter = letter.format(
                name=self.data["recipient"],
                message=self.data["message"],
                author=self.data["sender"])
            self._rendered_template = letter

        def _send(self) -> bool:
            put_in_a_mailbox(self._rendered_template)
            return True
Now it can be used as a channel named PaperMail:
    dispatcher = Dispatcher(
        channels="PaperMail",
        enable=True,
        config_file="mailbox_coordinates.json")

    my_message = {
        "recipient": "Kate",
        "message": "I'm getting a divorce.",
        "sender": "Anthony"
    }

    dispatcher.send(my_message)
Read more at: Implementing additional Streams (channels).
Parameters:
- data (dict) – data that will be used in the rendering of the final message.
- config (dict) – variables that can be used for configuration purposes such as API keys or custom parameters.
render()
Renders the template that will be sent in a notification.
For rendering the template, this method uses a private _render() method.
Return type: str
Returns: a rendered template.
render_and_send()
Renders the template and then sends a notification.
This method is equivalent to calling render() and send() separately.
Return type: bool
Returns: True if a notification has been sent successfully.
send()
Sends a notification if the rendered template isn’t empty.
For sending a notification, this method uses a private _send() method.
Return type: bool
Returns: True if a notification has been sent successfully.
template_output()
Returns the output of a template for a channel.
This method will return an empty string if you try to retrieve a template without rendering it first.
    >>> mail = PaperMail(my_message, {"DeliverOnTime": False})
    >>> mail.template_output()
    ''
    >>> mail.render()
    "Hey, Kate! I'm getting a divorce. Take care, Anthony."
    >>> mail.template_output()
    "Hey, Kate! I'm getting a divorce. Take care, Anthony."
Return type: str
Returns: a rendered template.
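The division of labour between the public methods and the private _render() and _send() hooks can be sketched as a small base class. NotificationSketch and EchoChannel are hypothetical, and the attribute names self.data and self.config are assumptions; only _rendered_template and the method names come from the reference.

```python
from typing import List, Optional


class NotificationSketch:
    """Hypothetical base class mirroring the documented public methods."""

    def __init__(self, data: dict, config: Optional[dict] = None) -> None:
        self.data = data                  # assumption: attribute name
        self.config = config or {}        # assumption: attribute name
        self._rendered_template = ""

    def _render(self) -> None:
        raise NotImplementedError

    def _send(self) -> bool:
        raise NotImplementedError

    def render(self) -> str:
        self._render()
        return self._rendered_template

    def send(self) -> bool:
        # Nothing is sent while the rendered template is empty.
        if not self._rendered_template:
            return False
        return self._send()

    def render_and_send(self) -> bool:
        self.render()
        return self.send()

    def template_output(self) -> str:
        return self._rendered_template


class EchoChannel(NotificationSketch):
    """Hypothetical channel that records messages instead of sending them."""

    outbox: List[str] = []

    def _render(self) -> None:
        self._rendered_template = "Hey, {recipient}! {message}".format(**self.data)

    def _send(self) -> bool:
        self.outbox.append(self._rendered_template)
        return True


note = EchoChannel({"recipient": "Kate", "message": "See you soon."})
before = note.template_output()   # empty until render() has been called
sent_early = note.send()          # refused: nothing rendered yet
sent = note.render_and_send()
```

A subclass only overrides the two private hooks, so the empty-template guard in send() applies to every channel without being repeated.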