# SSO Privacy Evaluation Readme

# Prerequisites

* Jupyter Notebook >=7.2.2 (`pip3 install "notebook>=7.2.2"`)
* Dependencies (`pip3 install "ipywidgets>=8.1.5" "tldextract>=5.1.2" "pandas>=2.2.3" "matplotlib>=3.9.2" "haralyzer>=2.4.0"`):
  * ipywidgets >= 8.1.5
  * tldextract >= 5.1.2
  * pandas >= 2.2.3
  * matplotlib >= 3.9.2
  * haralyzer >= 2.4.0
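
To confirm that the environment meets these minimums before opening a notebook, a small check such as the following may help. It is a minimal sketch and not part of the evaluation scripts: the helper names `parse_version` and `check_requirements` are hypothetical, and the version comparison only looks at the first three numeric components.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

# Minimum versions copied from the list above.
REQUIRED = {
    "notebook": "7.2.2",
    "ipywidgets": "8.1.5",
    "tldextract": "5.1.2",
    "pandas": "2.2.3",
    "matplotlib": "3.9.2",
    "haralyzer": "2.4.0",
}


def parse_version(text):
    """Turn '7.2.2' into (7, 2, 2); suffixes such as 'rc1' are ignored."""
    numbers = []
    for piece in text.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 3:  # pad '10.0' to (10, 0, 0)
        numbers.append(0)
    return tuple(numbers)


def check_requirements(required=REQUIRED):
    """Return {package: problem} for every missing or outdated package."""
    problems = {}
    for name, minimum in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if parse_version(installed) < parse_version(minimum):
            problems[name] = f"found {installed}, need >= {minimum}"
    return problems


print(check_requirements() or "all requirements satisfied")
```

An empty result means every package is installed in a sufficient version; otherwise the printed dictionary names the packages to install or upgrade.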

# Partial Leak Evaluation

* To run the evaluation, download the evaluation script `Partial-Leak-Evaluation.ipynb` (see [here](https://files.sso-privacy-leak.info/share/W6rrqEnb/2-Evaluation/Evaluation%20Scripts/Partial-Leak-Evaluation.ipynb)) and run the command `jupyter notebook` inside the directory containing the script.
* Open the file `Partial-Leak-Evaluation.ipynb` in the Jupyter file browser and set the variable `sets_to_analyse` to the path where the results of your partial leak scan(s) are stored. These can be your own scan result files or the results from the paper (see [here](https://files.sso-privacy-leak.info/share/W6rrqEnb/1-Scanning/Partial-Leak-Scans/partial-leak-scan-data-top-1m.zip)).
* Now run all cells. After the last cell finishes, the file `exported_lreq.csv` is written to the `/tmp` directory.
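
To check that the export succeeded, the file can be inspected outside the notebook. The following is a minimal sketch under two assumptions that this README does not confirm: that `exported_lreq.csv` is comma-separated and that its first row is a header. The helper name `summarize_export` is hypothetical.

```python
import csv
from pathlib import Path

# Export location named in this README.
EXPORT_PATH = Path("/tmp/exported_lreq.csv")


def summarize_export(path=EXPORT_PATH):
    """Return (header, row_count) for the exported CSV file.

    Assumes a comma-separated file whose first row is a header.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])            # first row, or [] for an empty file
        row_count = sum(1 for _ in reader)   # remaining rows are data rows
    return header, row_count
```

Calling `summarize_export()` after the notebook has finished returns the column names and the number of exported rows without making any assumption about what the columns contain.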

### Cell results
The output of each relevant cell is described below. Cells that are not listed set up the environment, run data analyses that do not produce any results themselves, or export the results to the result files.

#### Cell 4
Cell 4 produces numbers for Table 1 (T1), Table 4 (T4), and §5.1 of the paper:
- Total number of login requests found (T1)
- Unique requests (§5.1)
- Count and type of sites that leak for multiple providers (§5.1)
- Leak statistics for each IdP (T1 & §5.1)
- Counts and statistics for errors that occurred while scanning for partial leaks (T4)

#### Cell 5
Cell 5 produces the raw numbers for Figure 3. The percentage values were calculated manually.
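
The manual percentage step can also be scripted. The sketch below is an illustration only: the helper `to_percentages` is hypothetical, the category labels and counts in the usage line are placeholders rather than values from the paper, and it assumes each percentage is taken relative to the sum of the counts unless a different `total` is passed. The denominator actually used for Figure 3 is not stated in this README.

```python
def to_percentages(counts, total=None, digits=1):
    """Convert raw counts into percentages.

    counts: mapping of label -> raw count (as printed by the notebook cell)
    total:  denominator; defaults to the sum of all counts (an assumption)
    digits: number of decimal places to round to
    """
    if total is None:
        total = sum(counts.values())
    if total == 0:
        # Avoid division by zero when there is nothing to compare against.
        return {label: 0.0 for label in counts}
    return {
        label: round(100 * value / total, digits)
        for label, value in counts.items()
    }


# Placeholder input, not data from the paper.
print(to_percentages({"category_a": 30, "category_b": 10}))
```

Passing an explicit `total` covers the case where the percentages refer to a larger population than the categories shown.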

# Full & Escalation Leak Evaluation

* To run the evaluation, download the evaluation script `Full-Leak-Evaluation.ipynb` (see [here](https://files.sso-privacy-leak.info/share/W6rrqEnb/2-Evaluation/Evaluation%20Scripts/Full-Leak-Evaluation.ipynb)) and, if it is not already running, run the command `jupyter notebook` inside the directory containing the script.
* Open the file `Full-Leak-Evaluation.ipynb` in the Jupyter file browser and set the variable `sets_to_analyse` to the path where the results of your full leak scan(s) are stored. As an alternative, you can use our scan results, which are available [here](https://files.sso-privacy-leak.info/share/W6rrqEnb/1-Scanning/Full-Leak-Scans/). Use the zip files to download all results for a specific identity provider.
* The results are exported to the `tmp` directory.

### Cell results
The output of each relevant cell is described below. Cells that are not listed set up the environment, run data analyses that do not produce any results themselves, or export the results to the result files.

#### Cell 3
Cell 3 shows all full leaks found for each IdP (T1). False positives are excluded, and the leaks are categorized.

#### Cell 5
Cell 5 analyses FedCM requests (§6.3).

#### Cell 7
Cell 7 lists all escalated leaks found (T1).
