API Reference

Authenticating user(s)

class usos.authentication.Credentials(username, password)

Manages and stores credentials required to successfully authenticate the user.

Creating credentials for a new user is straightforward:

from usos.authentication import Credentials

credentials = Credentials(
    username="123456", 
    password="password")

That way, you can set up and manage multiple accounts, for example:

john = Credentials(username="johndoe", password="...")
anna = Credentials(username="anna1995", password="...")
Parameters:
  • username (str) – username used in the Central Authentication System
  • password (str) – password bound to the username
class usos.authentication.Authentication(credentials, root_url, web_driver)

Authenticates the user with provided credentials.

Once you have an instance of the user’s Credentials, you can proceed with authentication. In this example, the environment variable USOS_SCRAPER_ROOT_URL is used as the root URL of the requested USOSweb application.

import os

from usos.authentication import Authentication

...

auth = Authentication(
    credentials=john, 
    root_url=os.environ["USOS_SCRAPER_ROOT_URL"], 
    web_driver=selenium_driver)
Parameters:
  • credentials (object) – username and password in an instance of Credentials.
  • root_url (str) – root URL of the USOSweb application on which the login procedure will be executed.
  • web_driver (object) – an instance of usos.web_driver.SeleniumDriver responsible for controlling the browser.
is_authenticated()

Checks whether the user is authenticated.

The current implementation first checks whether the user is already signed in; if not, it attempts to execute sign_in().

if auth.is_authenticated():
    # Even if the user has not been signed in before,
    # the method will attempt to sign them in, and if that
    # succeeds, do_serious_stuff() will be executed.

    do_serious_stuff()
Return type:bool
Returns:False only if the authentication process has failed.
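The check-then-sign-in behaviour described above can be sketched without a browser. This is a minimal illustration, not the library's implementation: AuthenticationSketch, the _signed_in flag, and the injected sign_in_result are hypothetical names used only to make the control flow runnable.

```python
class AuthenticationSketch:
    """Hypothetical stand-in; not usos.authentication.Authentication."""

    def __init__(self, sign_in_result: bool) -> None:
        self._signed_in = False                # assumed internal flag
        self._sign_in_result = sign_in_result  # injected outcome for the demo
        self.sign_in_calls = 0

    def sign_in(self) -> bool:
        # The real method drives a browser; here the outcome is injected.
        self.sign_in_calls += 1
        self._signed_in = self._sign_in_result
        return self._signed_in

    def is_authenticated(self) -> bool:
        # Already signed in: nothing else to do.
        if self._signed_in:
            return True
        # Otherwise attempt to sign in and report the outcome.
        return self.sign_in()


auth = AuthenticationSketch(sign_in_result=True)
first = auth.is_authenticated()   # triggers sign_in()
second = auth.is_authenticated()  # reuses the earlier sign-in
```

Under these assumptions, the second call does not sign in again, and False is returned only when the sign-in attempt itself fails.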
sign_in()

Performs the sign in procedure using a web_driver provided to the initializer.

Return type:bool
Returns:True if the procedure was successful.

Managing web drivers

class usos.web_driver.SeleniumDriver(headless, config={})

Provides a layer of abstraction to obtain a preconfigured object of a Selenium-compatible web driver.

Parameters:
  • headless (bool) – whether the driver should run in headless mode
  • config (dict) – set of config variables to tweak the behaviour of the web driver
exception_take_screenshot(codename)

Takes a screenshot of web driver’s current viewport.

This method can be utilized as a tool for troubleshooting non-trivial errors with parsing.

import logging

def perform_login_example(self, usr: str, pwd: str) -> object:
    try:
        ...
        return get_user_instance(usr, pwd)
    except Exception:
        # Capture the browser state before logging the failure.
        driver.exception_take_screenshot("perform-login")
        logging.exception("Could not retrieve user instance")
Parameters:codename (str) – name of the exception/event that will be added to the image’s filename
Return type:None
get_instance()

Returns an instance of selected web driver.

Changing the implementation of this method will allow you to integrate different web drivers and replace Chrome (the default) with Firefox, PhantomJS or something else.

To find out more, read Using custom web drivers.

Return type:object
Returns:a ChromeDriver object by default; can be overridden to return a different driver.
quit()

Forces the web driver to terminate.

Return type:None
reset()

Resets the instance of a web driver to None.

This method might prove useful while switching between different web drivers.

Return type:None
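How get_instance(), quit() and reset() might fit together can be sketched with stand-in objects. Everything here is an assumption made for illustration: FakeBrowser replaces a real Selenium driver, DriverSketch is not usos.web_driver.SeleniumDriver, and the lazy caching in the _driver attribute is a guess at one reasonable design.

```python
from typing import Optional


class FakeBrowser:
    """Stand-in for a Selenium-compatible driver object."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.closed = False

    def quit(self) -> None:
        self.closed = True


class DriverSketch:
    """Hypothetical wrapper that creates its driver lazily."""

    def __init__(self, headless: bool, config: Optional[dict] = None) -> None:
        self.headless = headless
        self.config = config or {}
        self._driver: Optional[FakeBrowser] = None

    def get_instance(self) -> FakeBrowser:
        # Create the default driver on first use, then reuse it.
        if self._driver is None:
            self._driver = FakeBrowser("chrome")
        return self._driver

    def quit(self) -> None:
        # Terminate the driver if one exists.
        if self._driver is not None:
            self._driver.quit()

    def reset(self) -> None:
        # Forget the current driver so the next call creates a new one.
        self._driver = None


class FirefoxSketch(DriverSketch):
    def get_instance(self) -> FakeBrowser:
        # Overriding get_instance() swaps in a different driver.
        if self._driver is None:
            self._driver = FakeBrowser("firefox")
        return self._driver


wrapper = DriverSketch(headless=True)
browser = wrapper.get_instance()
wrapper.quit()
wrapper.reset()
fresh = wrapper.get_instance()
```

In this sketch, reset() alone does not close the browser, so quit() is called first when switching drivers.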

Retrieving data

class usos.scraper.Scraper(root_url, destinations, authentication, data_controller, web_driver)

Navigates the interface and scrapes the data.

Parameters:
  • root_url (str) – the root URL of the USOSweb interface.
  • destinations (str) –
  • authentication (object) – an instance of usos.authentication.Authentication for accessing protected data.
  • data_controller (object) – a controller for storing and analysing scraped data.
  • web_driver (object) – a Selenium web driver instance for navigating.
go_to(destination)

Navigates to the provided destination.

Parameters:destination (str) – a part of the url that will be used to match the ScrapingTemplate.
Return type:None
quit()

Terminates the scraper.

Return type:None
run()

Runs the process of iterating through provided destinations.

Return type:None
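The relationship between run() and go_to() can be illustrated with a small stand-in. The format of destinations is not documented above, so this sketch assumes a space-separated string and a plain URL join; ScraperSketch and its visited list are hypothetical and exist only to show the iteration.

```python
from typing import List


class ScraperSketch:
    """Hypothetical stand-in; not usos.scraper.Scraper."""

    def __init__(self, root_url: str, destinations: str) -> None:
        self.root_url = root_url.rstrip("/")
        # Assumption: destinations are separated by whitespace.
        self.destinations: List[str] = destinations.split()
        self.visited: List[str] = []

    def go_to(self, destination: str) -> None:
        # A real scraper would load the page; this one records the URL.
        self.visited.append(f"{self.root_url}/{destination}")

    def run(self) -> None:
        # Visit every configured destination in order.
        for destination in self.destinations:
            self.go_to(destination)


scraper = ScraperSketch("https://usosweb.example.edu/", "grades schedule")
scraper.run()
```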

Storing and analysing the data

class usos.data.DataController(dispatcher)

Stores and performs analysis of collected data.

from usos.data import DataController
from usos.notifications import Dispatcher

my_dispatcher = Dispatcher(
    channels="Email SMS WebPush MessengerPigeon",
    enable=True,
    config_file="notifications.json")

data = DataController(dispatcher=my_dispatcher)

single_entry = {
    'entity': 'example-entity-type',
    'items': [...]
}

data.upload(single_entry)        
...
data.analyze()
Parameters:dispatcher (object) – instance of usos.notifications.Dispatcher responsible for providing the notifications via available channels.
_compare(old, new)

Compares two entities with each other.

Parameters:
  • old (dict) – locally stored entity.
  • new (dict) – newly retrieved entity.
Return type:None

_compare_items(old, new, append_if_missing=False)

Compares two lists of items.

Parameters:
  • old (list) – items from an old entity.
  • new (list) – items from a new entity.
  • append_if_missing (bool) – whether the item should be added to the final results if it is present in new but missing in old.
Return type:list
Returns:items that share the same identifiers but have updated values.
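One way such a comparison could work is sketched below. The identifier keys (group, subgroup, hierarchy, item) and the choice to return the new version of each changed item are assumptions inferred from the examples in this reference, not confirmed behaviour; same_item and compare_items are hypothetical free functions.

```python
from typing import Dict, List

# Assumed identifier keys; "values" is treated as the payload.
IDENTIFIERS = ("group", "subgroup", "hierarchy", "item")


def same_item(old: Dict, new: Dict) -> bool:
    # Two items match when all identifier fields are equal.
    return all(old.get(key) == new.get(key) for key in IDENTIFIERS)


def compare_items(old: List[Dict], new: List[Dict],
                  append_if_missing: bool = False) -> List[Dict]:
    changed = []
    for new_item in new:
        match = next((o for o in old if same_item(o, new_item)), None)
        if match is None:
            # Present in new but absent from old.
            if append_if_missing:
                changed.append(new_item)
        elif match.get("values") != new_item.get("values"):
            # Same identifiers, different values.
            changed.append(new_item)
    return changed


rectangle = {"group": "Shapes", "item": "A rectangle", "values": [10, 20]}
new_rectangle = {"group": "Shapes", "item": "A rectangle", "values": [30, 50]}
square = {"group": "Shapes", "item": "A square", "values": [10, 10]}

updated = compare_items([rectangle], [new_rectangle, square])
with_missing = compare_items([rectangle], [new_rectangle, square],
                             append_if_missing=True)
```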

_get_filename(data)

Returns a filename based on the data’s entity-type.

>>> grades = {
...     "entity": "final-grades",
...     "items": []
... }

>>> course_results = {
...     "entity": "course-results-tree",
...     "items": [
...         {
...             "group": "28-INF-S-DOLI",
...             "subgroup": "Logic for Computer Science",
...             "hierarchy": "/Exam",
...             "item": "Results",
...             "values": ["104.5 pkt"]
...         }
...     ]
... }

>>> data._get_filename(grades)
'data/final-grades.json'
>>> data._get_filename(course_results)
'data/courses/28-inf-s-doli.json'
Return type:str
Returns:filename of the JSON file for a given entity.
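The two results shown above are consistent with a simple naming rule, sketched here as a free function. The rule itself is a guess reverse-engineered from those two examples (course results are filed under the lowercased group of their first item, everything else under the entity name); get_filename is a hypothetical name and the real method may handle more cases.

```python
def get_filename(data: dict) -> str:
    """Hypothetical reconstruction of the naming rule from the examples."""
    if data["entity"] == "course-results-tree" and data["items"]:
        # Assumption: one file per course group, named after the first item.
        group = data["items"][0]["group"].lower()
        return f"data/courses/{group}.json"
    # Assumption: every other entity maps to data/<entity>.json.
    return f"data/{data['entity']}.json"


grades = {"entity": "final-grades", "items": []}
course_results = {
    "entity": "course-results-tree",
    "items": [{"group": "28-INF-S-DOLI", "item": "Results"}],
}

grades_file = get_filename(grades)
course_file = get_filename(course_results)
```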
_load(filename)

Loads the data from a specified JSON file.

Parameters:filename (str) – name of the JSON file to load the data from.
Return type:dict
Returns:an entity retrieved from a file.
_same_item(old, new)

Checks whether a given new item carries the same identifiers as the old one.

>>> rectangle = {
...     "group": "Shapes",
...     "subgroup": "Two-dimensional",
...     "item": "A rectangle",
...     "values": [10, 20]
... }

>>> new_rectangle = {
...     "group": "Shapes",
...     "subgroup": "Two-dimensional",
...     "item": "A rectangle",
...     "values": [30, 50]
... }

>>> square = {
...     "group": "Shapes",
...     "subgroup": "Two-dimensional",
...     "item": "A square",
...     "values": [10, 10]
... }

>>> data._same_item(rectangle, square)
False
>>> data._same_item(rectangle, new_rectangle)
True
Parameters:
  • old (dict) – an element from the list of items of an old entity.
  • new (dict) – an element from the list of items of a new entity.
Return type:bool

_save(filename, data)

Saves the data to a specified JSON file.

Parameters:
  • filename (str) – name of the JSON file to save the data to.
  • data (dict) – data to store.
Return type:None
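Saving and loading an entity amounts to a JSON round trip. The sketch below uses only the standard library; save and load are hypothetical free functions standing in for the private methods, and the UTF-8 encoding and indentation are assumptions.

```python
import json
import os
import tempfile


def save(filename: str, data: dict) -> None:
    # Write the entity as UTF-8 JSON.
    with open(filename, "w", encoding="utf-8") as handle:
        json.dump(data, handle, ensure_ascii=False, indent=2)


def load(filename: str) -> dict:
    # Read the entity back from disk.
    with open(filename, encoding="utf-8") as handle:
        return json.load(handle)


entity = {"entity": "final-grades", "items": []}

# A temporary directory keeps the example from touching real data files.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "final-grades.json")
    save(path, entity)
    restored = load(path)
```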

analyze()

Analyzes the data held in temporary storage and passes the results to the notification dispatcher.

Return type:None
upload(item)

Uploads a given item to a temporary data storage.

The upload() method works on dictionaries structured as entities.

Learn more about entities here: Defining new entities.

data.upload({
    'entity': 'example-entity-type',
    'items': [...]
})
Parameters:item (dict) – item in an entity-compatible format.
Return type:None
upload_multiple(items)

Uploads a list of items to a temporary data storage.

Internally uses the upload() method on every item provided in the list.

my_final_grades = {
    'entity': 'final-grades',
    'items': [...]
}

entities = []
for i in range(0, 15):
    entities.append({
        'entity': 'iterations-for-analysis',
        'items': [{
            'group': 'Just a for loop',
            'item': 'A single iteration',
            'values': [i, i+1, i+2]
        }]
    })

# list.append() returns None, so its result must not be reassigned.
entities.append(my_final_grades)
data.upload_multiple(entities)
Parameters:items (list) – items in an entity-compatible format.
Return type:None
exception usos.data.NotAnEntity

An item is not in an entity-compatible format.
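A check for the entity-compatible format might look like the following. The exact validation rules are not specified here, so this sketch assumes the minimum suggested by the examples: a dict with a string 'entity' and a list of 'items'. NotAnEntity is redefined locally, and ensure_entity and upload_multiple are hypothetical helpers.

```python
from typing import Any, List


class NotAnEntity(Exception):
    """Local stand-in for usos.data.NotAnEntity."""


def ensure_entity(item: Any) -> dict:
    # Assumed minimum shape: {'entity': str, 'items': list}.
    if (not isinstance(item, dict)
            or not isinstance(item.get("entity"), str)
            or not isinstance(item.get("items"), list)):
        raise NotAnEntity(f"not an entity: {item!r}")
    return item


def upload_multiple(storage: List[dict], items: List[Any]) -> None:
    # Validate and store each item in turn, like repeated upload() calls.
    for item in items:
        storage.append(ensure_entity(item))


storage: List[dict] = []
upload_multiple(storage, [
    {"entity": "final-grades", "items": []},
    {"entity": "course-results-tree", "items": []},
])
```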

Dispatching notifications

class usos.notifications.Dispatcher(channels, enable, config_file)

Allows for sending multiple messages via configured channels.

Creating a new dispatcher that operates on selected channels:

from usos.notifications import Dispatcher

my_dispatcher = Dispatcher(
    channels="Email SMS WebPush MessengerPigeon",
    enable=True,
    config_file="notifications.json")
Parameters:
  • channels (str) – names of the channels separated by a single space.
  • enable (bool) – whether to allow the dispatcher to send any notifications.
  • config_file (str) – path to a file that contains channel-specific variables such as API Keys or special parameters.
send(data)

Sends notifications via channels set in the initializer.

Parameters:data (dict) – the data that will be sent.
Return type:bool
Returns:True if every notification has been sent successfully on every channel.
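The all-channels-must-succeed rule can be sketched as a plain function. This is an illustration under stated assumptions: send_to_channels and the senders mapping are hypothetical, a disabled dispatcher is assumed to return False, and an unknown channel name is assumed to count as a failure.

```python
from typing import Callable, Dict, List, Tuple

Sender = Callable[[dict], bool]


def send_to_channels(channels: str, enable: bool,
                     senders: Dict[str, Sender], data: dict) -> bool:
    # Assumption: a disabled dispatcher sends nothing and reports failure.
    if not enable:
        return False
    results = []
    # Channel names are separated by a single space.
    for name in channels.split(" "):
        sender = senders.get(name)
        # Try every channel, even after an earlier one has failed.
        results.append(sender(data) if sender is not None else False)
    return all(results)


delivered: List[Tuple[str, dict]] = []


def email(data: dict) -> bool:
    delivered.append(("Email", data))
    return True


def sms(data: dict) -> bool:
    return False  # simulate a failing channel


senders = {"Email": email, "SMS": sms}
all_ok = send_to_channels("Email", True, senders, {"message": "hi"})
partial = send_to_channels("SMS Email", True, senders, {"message": "hi"})
```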
send_single(channel, data)

Sends notifications via a single, given channel.

Parameters:
  • channel (str) – a name of the channel.
  • data (dict) – data that will be used to render the templates.
Return type:bool
Returns:True if the notifications have been sent successfully on the given channel.

class usos.notifications.Notification(data, config={})

Provides a common layer of abstraction for every existing channel.

Use this class for inheritance while implementing custom streams:

class PaperMail(Notification):
    def _render(self) -> None:
        # Parentheses let the template span several lines.
        letter: str = ("Hey, {name}! "
                       + "{message} "
                       + "Take care, {author}.")

        # Assumes the base class stores the constructor's data as self.data.
        letter = letter.format(
            name=self.data["recipient"],
            message=self.data["message"],
            author=self.data["sender"])

        self._rendered_template = letter

    def _send(self) -> bool:
        put_in_a_mailbox(self._rendered_template)
        return True

Now it can be used as a channel named PaperMail:

dispatcher = Dispatcher(
    channels="PaperMail",
    enable=True,
    config_file="mailbox_coordinates.json")

my_message = {
    "recipient": "Kate",
    "message": "I'm getting a divorce.",
    "sender": "Anthony"
}

dispatcher.send(my_message)

Read more at: Implementing additional Streams (channels).

Parameters:
  • data (dict) – data that will be used in the rendering of the final message.
  • config (dict) – variables that can be used for configuration purposes such as API Keys or custom parameters.
render()

Renders the template that will be sent in a notification.

For rendering the template, this method uses a private _render() method.

Return type:str
Returns:a rendered template.
render_and_send()

Renders the template and then sends a notification.

This method is equivalent to calling render() and send() separately.

Return type:bool
Returns:True if a notification has been sent successfully.
send()

Sends a notification if the rendered template isn’t empty.

For sending a notification, this method uses a private _send() method.

Return type:bool
Returns:True if a notification has been sent successfully.
template_output()

Returns the output of a template for a channel.

This method will return an empty string if you try to retrieve a template without rendering it first.

>>> mail = PaperMail(my_message, {"DeliverOnTime": False})
>>> mail.template_output()
''
>>> mail.render()
"Hey, Kate! I'm getting a divorce. Take care, Anthony."
>>> mail.template_output()
"Hey, Kate! I'm getting a divorce. Take care, Anthony."
Return type:str
Returns:a rendered template.
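The way the public methods delegate to the private _render() and _send() hooks can be sketched as a small base class. This is not the library's code: NotificationSketch, the data and _rendered_template attributes, and the EchoNote subclass are assumptions chosen to mirror the behaviour described above.

```python
from typing import List, Optional


class NotificationSketch:
    """Hypothetical stand-in; not usos.notifications.Notification."""

    def __init__(self, data: dict, config: Optional[dict] = None) -> None:
        self.data = data
        self.config = config or {}
        self._rendered_template = ""

    def _render(self) -> None:
        raise NotImplementedError

    def _send(self) -> bool:
        raise NotImplementedError

    def render(self) -> str:
        # Delegate to the channel-specific hook, then expose the result.
        self._render()
        return self._rendered_template

    def send(self) -> bool:
        # Nothing is sent while the rendered template is empty.
        if not self._rendered_template:
            return False
        return self._send()

    def render_and_send(self) -> bool:
        self.render()
        return self.send()

    def template_output(self) -> str:
        return self._rendered_template


class EchoNote(NotificationSketch):
    """Example channel that collects messages in memory."""

    def __init__(self, data: dict, config: Optional[dict] = None) -> None:
        super().__init__(data, config)
        self.outbox: List[str] = []

    def _render(self) -> None:
        self._rendered_template = "Hey, {recipient}! {message}".format(**self.data)

    def _send(self) -> bool:
        self.outbox.append(self._rendered_template)
        return True


note = EchoNote({"recipient": "Kate", "message": "Hello."})
before = note.template_output()   # empty until render() is called
premature = note.send()           # refused: nothing rendered yet
sent = note.render_and_send()
```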