Code for the DH project "Dhimmis & Muslims – Analysing Multireligious Spaces in the Medieval Muslim World"


Damast

This repository contains code developed for the digital humanities project "Dhimmis & Muslims – Analysing Multireligious Spaces in the Medieval Muslim World". The project was funded by the Volkswagen Foundation within the scope of the "Mixed Methods" initiative. It was a collaboration between the Institute for Medieval History II of the Goethe University in Frankfurt/Main, Germany, and the Institute for Visualization and Interactive Systems at the University of Stuttgart, Germany, and ran from 2018 to 2021.

The objective of this joint project was to develop a novel visualization approach to gain new insights into the multi-religious landscapes of the Middle East under Muslim rule during the Middle Ages (7th to 14th century). In particular, information on multi-religious communities was researched and made available in a database, which can be accessed through interactive visualization as well as through a pilot web-based geo-temporal multi-view system for analyzing and comparing information from multiple sources. A publicly explorable version of the research will soon be available and will be linked here. An export of the data collected in the project can be found in the data repository of the University of Stuttgart (DaRUS) (draft, not yet public).

Database

The historical data is collected in a relational PostgreSQL database; for the project, we used PostgreSQL version 10. Since the project also deals with geographical data, we additionally use the PostGIS extension. A suitable database setup is the postgis/postgis:10-3.1 Docker image. An SQL script for creating the database schema is located in util/postgres/schema.sql, as well as in DaRUS. An overview of the interplay between the database tables, together with a general explanation, can be found in the docs/ directory.
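For example, a local database container could be set up as follows. This is only a sketch: the container name, password, and port mapping are illustrative and not prescribed by the project.

# start a PostgreSQL 10 server with PostGIS (fictional values)
$ sudo docker run -d \
    --name damast-postgres \
    -e POSTGRES_PASSWORD=<choose a password> \
    -p 5432:5432 \
    postgis/postgis:10-3.1

# create the database schema in the running server
$ psql -h localhost -U postgres -f util/postgres/schema.sql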

Software

The server is programmed in Python using Flask. Functionality is split up into a hierarchy of Flask blueprints: for example, there is a blueprint for the landing page, one for the visualization, and a nested hierarchy of blueprints for the REST API. The server provides multiple pages, as well as an HTTP interface for reading from and writing to the PostgreSQL database. It is built and deployed as a Docker container that contains all necessary dependencies.

An overview and explanation of the different pages and functionalities is located in the docs/ directory. The web pages consist of HTML, CSS, and JavaScript. In most cases, the HTML content is served via Jinja2 templates that are processed by Flask. The JavaScript code is compiled from TypeScript sources, and the CSS is compiled from SCSS.

Getting Started

Basic knowledge of build tools and Docker is required. The instructions below assume a Linux machine with the Bash shell.

Installing Dependencies

On the build system, Docker and Node.js need to be installed. If the Makefile is used, the build-essential package is required as well. In the root of the repository, run the following command to install the build dependencies:

$ npm install

Building the Frontend

To build all required files for the frontend (JavaScript, CSS, documentation), the Makefile can be used, or consulted for the appropriate commands:

$ make prod

Building the Docker Image

All frontend content and backend Flask code are bundled into a Docker image, in which the required software dependencies are also installed in the correct versions. A few configuration options need to be baked into the Docker image on creation; these depend on the setup in which the Docker container will later run. Please refer to the deploy.sh shell script for details and examples, as well as to the section on running the server below. The Dockerfile is assembled from parts in util/docker/ and then enriched with runtime information to ensure that certain steps are repeated when data changes. An exemplary creation of a Docker image (with fictional values; please refer to deploy.sh before copying) could look as follows:

# calculate hash of server files to determine if the COPY instruction should be repeated
$ fs_hash=$(find dhimmis -type f \
    | xargs sha1sum \
    | awk '{print $1}' \
    | sha1sum - \
    | awk '{print $1}')

# assemble Dockerfile
$ cat util/docker/{base,prod}.in \
    | sed "s/@REBUILD_HASH@/$fs_hash/g" \
    > Dockerfile

# build Dockerfile (warning: dummy parameters!)
$ sudo docker build -t damast:latest \
    --build-arg=USER_ID=50 \
    --build-arg=GROUP_ID=50 \
    --build-arg=DHIMMIS_ENVIRONMENT=PRODUCTION \
    --build-arg=DHIMMIS_VERSION=v1.0.0 \
    --build-arg=DHIMMIS_PORT=8000 \
    .

The resulting Docker image can then be transferred to the host machine, for example using docker save and docker load. Of course, the image can also be built directly on the host machine.
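For instance, the transfer could look as follows (the archive file name is illustrative):

# on the build machine: export the image into a compressed archive
$ sudo docker save damast:latest | gzip > damast-latest.tar.gz

# on the host machine, after copying the archive over: import the image
$ gunzip -c damast-latest.tar.gz | sudo docker load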

Running the Server

The server infrastructure consists of three components:

  1. The Flask server in its Docker container,
  2. the PostgreSQL database, for example in the form of a postgis/postgis:10-3.1 Docker container, and
  3. a reverse HTTP proxy on the host machine that handles traffic from the outside and SSL.

The util/ directory contains configuration templates for an NGINX reverse proxy, cron, the start script, the systemd configuration, and the user authentication file. The documentation goes into more detail about the setup. A directory on the host machine is mapped as a volume to the /data directory in the Docker container; this directory contains runtime configuration files (users.db, reports.db) as well as log files. The main Docker container requires some additional runtime configuration, for example the PostgreSQL password, which can be passed as environment variables to Docker using the --env and --env-file flags. The following configuration environment variables exist; most have a sensible default:

| Environment Variable | Default Value | Description |
|---|---|---|
| DHIMMIS_ENVIRONMENT | (none) | Server environment (PRODUCTION, TESTING, or PYTEST). This decides which PostgreSQL database to connect to (ocn, testing, and pytest (in the Docker container), respectively). Usually set via the Docker image. |
| DHIMMIS_VERSION | (none) | Software version. Usually set via the Docker image. |
| DHIMMIS_USER_FILE | /data/users.db | Path to an SQLite3 file with users, passwords, and roles. |
| DHIMMIS_REPORT_FILE | /data/reports.db | File to which reports are stored during generation. |
| DHIMMIS_SECRET_FILE | /dev/null | File with the JWT and app secret keys. These are randomly generated if not passed, but that is impractical for testing with hot reload (user sessions do not persist). For a production server, this should be empty. |
| DHIMMIS_PROXYCOUNT | 1 | Number of reverse proxies the server is behind. This is necessary for proper HTTP redirection and cookie paths. |
| DHIMMIS_PROXYPREFIX | / | Reverse proxy prefix. |
| FLASK_ACCESS_LOG | /data/access_log | Path to the access log. |
| FLASK_ERROR_LOG | /data/error_log | Path to the error log. |
| DHIMMIS_PORT | 8000 | Port at which gunicorn serves the content. Note: this is set via the Dockerfile and only used there. |
| PGHOST | localhost | PostgreSQL hostname. |
| PGPASSWORD | (none) | PostgreSQL password. This is important to set, and depends on how the database is set up. |
| PGPORT | 5432 | PostgreSQL port. |
| PGUSER | api | PostgreSQL user. |
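As an illustration, the container could be started as follows. This is only a sketch: the environment file contents, host paths, and port mapping are fictional and depend on the concrete setup (compare deploy.sh and the documentation).

# example environment file "env.list" (all values fictional)
PGHOST=172.17.0.1
PGPASSWORD=<database password>
PGUSER=api
DHIMMIS_PROXYCOUNT=1

# start the container with the environment file and the /data volume
$ sudo docker run -d \
    --env-file env.list \
    -v /srv/damast/data:/data \
    -p 8000:8000 \
    damast:latest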
Comments
  • What data to include in the live version?

    Initially opened by gs108488 in 211@TIK. History:

    gs108488 Dec 15, 2021:

    Similar to issue #172, we need to define which data is included in the live version.

    help wanted question 
    opened by mfranke93 40
  • Check texts in Help/Info menus

    Initially opened by gs108488 in 223@TIK. History:

    gs108488 Dec 16, 2021:

    All texts in the info boxes that open when clicking on the ? symbols in the visualization need to be checked, especially with respect to publishing the live version.

    gs108488 Dec 16, 2021:

    I have put this on hold until the state of the visualization for the live version is really final.

    documentation enhancement 
    opened by mfranke93 37
  • Change text and appearance of the start page for the live version

    Initially opened by gs108488 in 222@TIK. Relevant history:

    gs108488 Dec 16, 2021:

    For the live version, the start page needs a complete makeover. Apart from being "English only" (see ~~#210~~), the start page will need different sections. For the design, the styles inherited from the overall base.htm need to be considered.

    enhancement 
    opened by mfranke93 31
  • Explanation of "Aggregation of religious groups"

    In the info text of the map, I find the following sentences:

    If in all shown map glyphs no more than four nodes of a lower part in the religion hierarchy would be present, the data is aggregated on that lower level. For example, if the map would only show two glyphs; one with the Latin Church, the Coptic Church, and the Georgian Orthodox Church; and the other with the Latin Church and the Rabbanites; each of these religious denominations could be represented by an individual circle. This will only happen if it is possible in all glyphs, as doing otherwise would skew the perceived variety of religions.

    I can hardly understand this. Is there something wrong, e.g., with the first phrase in the <em> tag?

    help wanted question 
    opened by tutebatti 23
  • Include section "How to Cite" in Report

    • Where would I suggest what the section needs to look like? This is not as easy as changing the HTML of the info texts, is it?
    • What is more, I am not sure who the "authors" of a report are when it comes to citation. Instead of standard bibliographical data, it should probably be something like:

    We suggest citing this report in the following way: "Report ###UUID### created by ###YOUR NAME### based on the visualization and data of Weltecke, Dorothea et al. "Dhimmis and Muslims. A Tool [...]" accessible via ###URI###."

    enhancement 
    opened by tutebatti 20
  • Problems with the cookie dialogue in Firefox?

    Yesterday, someone from our team in Berlin tried visiting the public instance at damast.geschichte.hu-berlin.de with Firefox (on Windows), and the "Accept" button in the cookie dialogue did not respond. On my PC, with Firefox (on Linux), I got a grayed-out start page but no dialogue at all.

    I would be happy about any recommendations for tracking down these bugs.

    bug 
    opened by tutebatti 19
  • Example when explaining regular expressions for "Place search"

    In the current example in the info text for the place search, one reads:

    The search field supports JavaScript-style regular expressions. For example, to search for locations with an Arabic definite article, the query \ba([tdrzsṣḍṭẓln]|[tds]h)- can be used.

    If I understand correctly from the list of places, we do not use the DMG notation for Arabic articles (cf. https://de.wikipedia.org/wiki/DIN_31635). That example makes little sense, then. Any better suggestions? @rpbarczok, you probably know the data itself better than @mfranke93?

    help wanted discussion 
    opened by tutebatti 16
  • "NULL" vs. empty in column "comment" of Place table

    I'm currently looking through the Place table, especially to check for major inconsistencies in the comment column, which is part of the place URI page. On the one hand, there were 3 or 4 rows with line breaks. On the other hand, some rows have nothing in the comment column, while some have NULL. Is that normal?

    Note: I am editing a CSV downloaded via pgAdmin in LibreOffice Calc.

    help wanted question 
    opened by tutebatti 15
  • Listing pieces of evidence on place URI pages?

    Connected to #144 and #146, a reason to generate reports is that they are the only way to access the detailed pieces of evidence. The visitor seemingly wanted to create reports to read about the different pieces of evidence for a specific place. What would you say, @rpbarczok?

    opened by tutebatti 12
  • Access to the place URI page from the map

    Another comment by the same visitor whose feedback was reported in #144:

    It would be great to access the place URI page directly from the map.

    Right now, the "mechanics" are as follows: hovering displays a tooltip, but one can probably not use that as a link to the place URI page, as it vanishes when the mouse is moved. Clicking on the place (or the glyph) brushes and links, which then makes it possible to go to the place URI page from the location list. So there is probably no easy solution. Maybe one could open the same popup as with hovering via a right-click and make the tooltip persistent? Just an idea.

    enhancement 
    opened by tutebatti 12
  • Changes to the automatically generated report

    I looked through the record and would like to suggest the following improvements:

    1.) Query report: The first three filters of the query report ("evidence must be visible", "place must be visible", and "place type must be visible") are, to my knowledge, not part of the filter possibilities of the visualization. One has to know the data structure quite well to understand these statements, so I believe users will be confused by them. Additionally, to my knowledge, no data in the DhiMu project is hidden. I would therefore like these statements to be removed if there are no objections.

    2.) Evidence:
    a) I would rather have the pieces of evidence sorted by a) city name, b) religion name, c) start of the time span.
    b) In English lexicography there is no plural form of "evidence". An English speaker would rather use the term "pieces of evidence". I think we have to change that.
    c) The evidence fragment of the report distinguishes between hidden and visible data. Is there a case when hidden data is shown in the report? To my knowledge, hidden data is not part of the publicly available filtering system. In any case: no hidden evidence, places, or regions should be part of the DhiMu data, so this does not need to be included in the report.
    d) I would like to move the footnote with the evidence comment to the end of the sentence.
    e) We should use "religious group" instead of "religion".
    f) If I understand the source code correctly, a place with the category "unknown" is displayed as "place". Maybe it would be better to add "undefined" to it: "undefined place". What do you think?
    g) I would rather have a colon between the short title of the source and the content of the source_instance.source_page cell.
    h) I would like to move the source comment into a footnote at the end of the sentence.

    3.) Places:
    a) Regarding the place type "unknown", I suggest adding "undefined" to "place", as above.
    b) I would like the list items in the "Linked to" section formatted in the following way:

    <place.name> is linked to: Digital Atlas of the Roman Empire (DARE): http://imperium.ahlfeldt.se/places/22329

    i.e. without the short title and the ID at the beginning.

    @tutebatti, @mfranke93: what do you think?

    enhancement discussion 
    opened by rpbarczok 12
Releases: v1.1.5+history
Owner: University of Stuttgart Visualization Research Center