This is a Docker-based pipeline for preparing SExtractor-ready multiwavelength images.

Overview

Pipeline for creating an NB422-detected (ODI) catalog

The repository contains a Docker-based pipeline for preprocessing observational data. The pipeline creates a "master catalog" that combines SExtractor catalogs built from the NB422 (ODI) image and other broadband images in the COSMOS field. The pipeline is fully automated except for certain tasks that require user input (e.g. the IRAF msccmatch, imexamine and SExtractor tasks).

Features:

  • the run environment is independent of the local machine
  • only the necessary steps are re-run after input data or parameters change
  • stdout from the IRAF iterstat and imexamine tasks is analyzed automatically and the result feeds the following step (see the sketch below)
  • multiple host operating systems are supported

The pipeline is not meant to be run as is; it should be modified to suit your specific data-analysis routine.
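For example, the noise-matching step needs the sky sigma that iterstat prints. Below is a minimal sketch of the capture-and-parse pattern, assuming iterstat is installed in your IRAF environment and prints a sigma= field; the regex must be adapted to your version's actual output.

```python
import re
from pyraf import iraf

def sky_sigma_from_iterstat(image):
    """Run iterstat through PyRAF, capture its printout, extract sigma."""
    # Stdout=1 makes PyRAF return the task's output as a list of lines
    # instead of printing it to the terminal.
    lines = iraf.iterstat(image, Stdout=1)
    for line in lines:
        match = re.search(r"sigma=\s*([0-9.eE+-]+)", line)
        if match:
            return float(match.group(1))
    raise RuntimeError(f"no sigma found in iterstat output for {image}")
```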

License

Licensed under the MIT license; see LICENSE

Software used in the example

SAOImageDS9, Python/AstroPy, IRAF/PyRaf, IDL, SExtractor

Prerequisites

  • Docker
  • IDL, Python3 (if running the IDL-based generate_kernel task is desired)

Instructions

  1. prepare the mosaic images under a data folder and name them NB422.fits, g.fits, etc. (RMS images as g.rms.fits, etc.); download the coordinate file from the Gaia server to the same folder and name it gaia.coo
  2. make any necessary changes to adapt the code to your workflow
  3. edit docker-compose.yml and change /path/to/data_folder to the actual location of the data folder on the host machine
  4. (steps 4-6 apply only if you want to run the generate_kernel task) open a second terminal, then run export REQUEST_PORT=8088 and export RESPONSE_PORT=8089 (or any other ports you choose) in both terminals
  5. copy server.py and generate_kernel.pro to the data folder
  6. run python server.py (a rough sketch of such a server appears after this list)
  7. Back in the repository, run docker build -t odi_pipeline . to build the image
  8. After the image is built successfully, run docker-compose run --service-ports pipeline to create a container
  9. inside the container, run python3 -m py_programs.tasks.runner create_makefile
  10. go to the /mnt/data folder and run make master_catalog.csv
    • caution: Mac and Windows users should check Additional notes
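server.py implements the host side of the kernel-generation handshake. As a rough illustration only (the actual message format and IDL invocation in server.py may differ), a host-side server can wait for a request on REQUEST_PORT, run IDL, and report completion back to the container on RESPONSE_PORT:

```python
import os
import socket
import subprocess

REQUEST_PORT = int(os.environ["REQUEST_PORT"])    # e.g. 8088
RESPONSE_PORT = int(os.environ["RESPONSE_PORT"])  # e.g. 8089

def serve_once():
    """Handle one kernel request from the container."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("0.0.0.0", REQUEST_PORT))
        srv.listen(1)
        conn, addr = srv.accept()
        with conn:
            request = conn.recv(4096).decode().strip()  # e.g. a band name
    # Run the IDL routine on the shared data folder; this command line
    # for generate_kernel.pro is an assumption, not the repository's.
    subprocess.run(["idl", "-e", f"generate_kernel, '{request}'"], check=True)
    # Tell the container (which connected from addr[0]) that the kernel
    # file is ready.
    with socket.create_connection((addr[0], RESPONSE_PORT)) as reply:
        reply.sendall(b"done")

if __name__ == "__main__":
    while True:
        serve_once()
```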

Brief explanation of data processing in the pipeline

  1. calibrate the astrometry against gaia.coo, using the IRAF msccmatch task
  2. copy the new WCS from the mosaic image headers to their RMS images, using IRAF wcscopy
  3. (steps 3-4 apply only to broadband images) reproject the broadband images to the tangent point and pixel scale of NB422.fits, using IRAF wregister
    • caution: turn on flux_conserve when dealing with the mosaic images; it is not needed for the RMS maps
  4. match the reprojected RMS map to the sky noise of the reprojected mosaic image, using IRAF iterstat
  5. make a flag map of the NB image, using Python
  6. measure the image PSFs, using IRAF imexamine
  7. make Moffat PSFs, using Python (see the sketch after this list)
  8. generate kernels that transform the original (broadband) PSFs to the NB PSF, using IDL max_entropy
  9. convolve the broadband images so that all images share the same PSF, using Python
  10. run SExtractor and make the master catalog
  11. All done!
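Steps 7 and 9 are plain Python. The sketch below, built on Astropy, shows one way they might look; the kernel size, Moffat beta, and file names are illustrative assumptions, with bb_to_nb.fits standing in for the kernel written by the max_entropy step.

```python
import numpy as np
from astropy.io import fits
from astropy.convolution import convolve_fft
from astropy.modeling.models import Moffat2D

def moffat_psf(fwhm_pix, beta=2.5, size=25):
    """Evaluate a unit-sum Moffat PSF on a size x size pixel grid."""
    # Convert the measured FWHM (from imexamine, step 6) to the Moffat
    # core width gamma: FWHM = 2 * gamma * sqrt(2**(1/beta) - 1).
    gamma = fwhm_pix / (2.0 * np.sqrt(2.0 ** (1.0 / beta) - 1.0))
    y, x = np.mgrid[:size, :size]
    model = Moffat2D(amplitude=1.0, x_0=size // 2, y_0=size // 2,
                     gamma=gamma, alpha=beta)
    psf = model(x, y)
    return psf / psf.sum()

# Step 9: convolve a broadband image with its bb -> NB kernel so that it
# matches the NB422 PSF.
image = fits.getdata("g.reproj.fits")           # illustrative file name
kernel = fits.getdata("bb_to_nb.fits")          # written by max_entropy
matched = convolve_fft(image, kernel, normalize_kernel=True)
fits.writeto("g.conv.fits", matched.astype(np.float32), overwrite=True)
```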

Additional notes

  • IDL is required for creating the kernels bb_to_nb.fits. The command is run on the host machine to avoid the complexity of installing IDL and setting up its license inside the image. The host communicates with the Docker container via a basic TCP connection.
  • The pipeline has been thoroughly tested on a 64-bit Ubuntu host.
  • the touch command is used after some IRAF/PyRAF tasks because IRAF changes the modification times of input files in unexpected ways, which would otherwise make the timestamp-based make system unusable (see the sketch below)
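As a minimal sketch of that workaround (file names are illustrative, and the package providing wcscopy may differ in your IRAF setup):

```python
from pathlib import Path
from pyraf import iraf

iraf.mscred()  # assumption: wcscopy is provided by the mscred package
# Copy the calibrated WCS from the mosaic image to its rms map (step 2
# of the processing summary above).
iraf.wcscopy("g.rms.fits", "g.fits")
# IRAF can leave the file with a stale or reordered modification time,
# so make would consider the target out of date forever; touching the
# file restores a timestamp newer than its prerequisites.
Path("g.rms.fits").touch()
```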

Operating system support

  • For both Windows and Mac systems:
    • edit docker-compose.yml: set environment: DISPLAY=host.docker.internal:0
    • edit server.py (line:29) and py_programs/func/generate_kernel.py (line:8): change 127.0.0.1 to host.docker.internal.
  • Windows:
  • Mac:
    • must have XQuartz installed
    • Launch XQuartz. Under the XQuartz menu, select Preferences
    • Go to the security tab and ensure "Allow connections from network clients" is checked.
    • Run xhost + ${hostname} to allow connections to the macOS host
    • For Macs with M1 (Apple silicon) processors, run export DOCKER_DEFAULT_PLATFORM=linux/amd64 before docker build