API & web app to answer questions about COVID-19, using NLP (question answering) and trusted data sources.

Overview

This open source project serves two purposes.

  1. Collection and evaluation of a question answering dataset to improve existing QA/search methods - COVID-QA
  2. Question matching capabilities: provide trustworthy answers to questions about COVID-19 via NLP (now outdated)

COVID-QA

Update 14th April, 2020: We are open-sourcing the first batch of SQuAD-style question answering annotations. Thanks to Tony Reina for managing the process and to the many professional annotators who spent valuable time looking through COVID-related research papers.

FAQ matching

Update 17th June, 2020: As the pandemic is thankfully slowing down and other information sources have caught up, we decided to take our hosted API and UI offline. We will keep the repository here as an inspiration for other projects and to share the COVID-QA dataset.

⚡ Problem

  • People have many questions about COVID-19
  • Answers are scattered on different websites
  • Finding the right answers takes a lot of time
  • Trustworthiness of answers is hard to judge
  • Many answers become outdated quickly

💡 Idea

  • Aggregate FAQs and texts from trustworthy data sources (WHO, CDC ...)
  • Provide a UI where people can ask questions
  • Use NLP to match incoming user questions with meaningful answers
  • Users can provide feedback about answers to improve the NLP model and flag outdated or wrong answers
  • Display most common queries without good answers to guide data collection and model improvements

⚙️ Tech

  • Scrapers to collect data
  • Elasticsearch to store texts, FAQs, embeddings
  • NLP models implemented via Haystack to find answers by a) detecting similar questions in FAQs and b) detecting answers in free text (extractive QA); a minimal sketch of the FAQ-matching stage follows this list
  • React Frontend
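
As a rough illustration of the first stage (not the project's actual Haystack pipeline), FAQ matching boils down to indexing question-answer pairs and matching an incoming question against the stored questions. A minimal sketch, assuming a local Elasticsearch instance and the elasticsearch-py 7.x client; the index name, field names, and example FAQs are illustrative:

    from elasticsearch import Elasticsearch

    # Assumes Elasticsearch is reachable on localhost:9200; the schema is illustrative.
    es = Elasticsearch(hosts=["localhost:9200"])

    faqs = [
        {"question": "How does COVID-19 spread?", "answer": "Mainly via respiratory droplets."},
        {"question": "What are common symptoms?", "answer": "Fever, dry cough, and tiredness."},
    ]
    for i, faq in enumerate(faqs):
        es.index(index="faq", id=i, body=faq)
    es.indices.refresh(index="faq")

    # Match an incoming user question against the stored FAQ questions (BM25).
    user_question = "How is the virus transmitted?"
    result = es.search(index="faq", body={"query": {"match": {"question": user_question}}})
    for hit in result["hits"]["hits"]:
        print(hit["_score"], hit["_source"]["question"], "->", hit["_source"]["answer"])

In the project itself this retrieval step is wrapped by Haystack, which adds embedding-based similarity for FAQ matching and an extractive reader for free text on top of Elasticsearch.
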
Issues & Pull Requests
  • Feature : Matched Question and Feedback Option

    Thanks again for creating this PR. Great work!

    Two comments:

    • Right now the answer displayed by the bot doesn't contain the "matched question". However, it might be helpful for users to see it in order to judge whether the answer is really relevant to their question. You can find it in the response JSON in the field "question".
    • We now also have an option for user feedback (see the API endpoint), so people can rate whether a given answer was helpful, and we will use that data to improve the NLP model. This could also be a helpful addition to the Telegram bot.

    Would be great to hear your thoughts on that and maybe address them in a separate PR.

    Originally posted by @tholor in https://github.com/deepset-ai/COVID-QA/pull/58#issuecomment-602513354

    opened by theapache64 11
  • Question : API

    I've been developing a Telegram bot using the API. Currently I am using https://covid-middleware.deepset.ai/api/bert/question to get the answers.

    curl -X POST \
      https://covid-middleware.deepset.ai/api/bert/question \
      -H 'content-type: application/json' \
      -d '{
    	"question":"community spread?"
    }'
    

    but the Swagger docs don't list this endpoint and instead show a different one with different request/response structures.

    So, my question is: which API should I use to get the answers? @tanaysoni
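
    For reference, the same call can be made from Python. A minimal sketch of the endpoint quoted above (note that the hosted API has since been taken offline, and the response fields are not documented here):

        import requests

        # Endpoint and payload taken from the curl example above.
        resp = requests.post(
            "https://covid-middleware.deepset.ai/api/bert/question",
            json={"question": "community spread?"},
            timeout=10,
        )
        resp.raise_for_status()
        print(resp.json())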

    opened by theapache64 10
  • Document Retrieval for extractive QA with COVID-QA

    Thank you so much for sharing your data and tools! I am working with the question-answering dataset for an experiment of my own.

    @Timoeller mentioned in #103 that the documents used in the annotation tool to create the COVID-QA.json dataset "are a subset of CORD-19 papers that annotators deemed related to Covid." I was wondering if these are the same documents as listed in faq_covidbert.csv.

    The reason I ask is that, as a workaround, I've created my own retrieval txt file(s) by extracting the answers from COVID-QA.json, but the results are hit or miss. They are particularly off if I break the file up into chunks to improve performance, for instance into a separate txt file for each answer. I'm assuming this is due to lost context. I'm wondering if I should simply be using faq_covidbert as illustrated here, even though I am using extractive QA.

    I took this approach because I was trying to follow the extractive QA tutorial as closely as possible.

    My ultimate objective is to compare the experience of using extractive QA vs FAQ-style QA, so I presumed that it would be apropos to have a bit of separation in the doc storage dataset.

    Thank you!
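
    For anyone attempting the same workaround: a SQuAD-style file such as COVID-QA.json can be split into one text file per document context (rather than per answer, which drops the surrounding context). A minimal sketch, assuming the standard SQuAD layout and illustrative file paths:

        import json
        from pathlib import Path

        # Standard SQuAD layout: data -> paragraphs -> context; paths are illustrative.
        with open("COVID-QA.json", encoding="utf-8") as f:
            squad = json.load(f)

        out_dir = Path("contexts")
        out_dir.mkdir(exist_ok=True)

        n = 0
        for article in squad["data"]:
            for paragraph in article["paragraphs"]:
                (out_dir / f"context_{n:04d}.txt").write_text(paragraph["context"], encoding="utf-8")
                n += 1
        print(f"wrote {n} context files")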

    opened by aaronbriel 6
  • Integrate SIL language identification API

    This PR integrates more inclusive language identification compared to cld2/3. To this end, the SIL Language Identification API is used as the default language identification model. This API currently supports 1,035 languages, including many lower-resourced languages, and hopefully by using this language ID COVID-QA can start leveraging a variety of data gathered in lower-resourced languages (e.g., via Elasticsearch). Some sources of this info are the SIL COVID resources, the Endangered Languages Project, and this repo.

    Important points

    • I've added new environment variables for the API key, secret, and URL in config.py. These can be obtained for free at developers.sil.org (a placeholder request sketch follows below).
    • I took cld2/3 out of the loop because closely related languages (e.g., Russian and Bulgarian) might be misidentified. cld2/3 doesn't support many languages and is thus biased in terms of the data that was used to train the model.
    • The SIL API returns ISO 639-3 codes only (3 letter language codes)

    How to test

    $ cd covid_nlp/language/
    $ python detect_language.py
    

    Other suggestions/ notes:

    • Right now any non-English language is routed to the FAQ-based matching with Elasticsearch (I think). It would be great to upgrade this a bit to determine which languages currently have data. Then, for languages without any content, we could route those to something like PanLex resources or SIL posters for low-resource languages.
    • I would like to upgrade the SIL API to support language ID for many samples at once. This may help in dynamically determining what content we have in the datastore.
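
    As a rough illustration of how the integration is wired (this is not the real SIL API contract; the endpoint shape, parameter names, and response field below are placeholders), the backend reads the key, secret, and URL from environment variables and posts the user's text for identification:

        import os
        import requests

        # All names and the request/response shape below are placeholders; the real
        # contract is defined by the SIL Language Identification API and config.py.
        API_URL = os.environ["SIL_API_URL"]
        API_KEY = os.environ["SIL_API_KEY"]
        API_SECRET = os.environ["SIL_API_SECRET"]

        def detect_language(text: str) -> str:
            """Return an ISO 639-3 code for the given text (placeholder request shape)."""
            resp = requests.post(
                API_URL,
                json={"text": text, "key": API_KEY, "secret": API_SECRET},
                timeout=10,
            )
            resp.raise_for_status()
            return resp.json()["language"]  # placeholder field name

        print(detect_language("Wie verbreitet sich das Virus?"))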

    Supported languages in the API

    Supported via Text Classification: biv, cub, cmo, hvn, kus, yal, pwg, myv, guo, des, leu, eip, cso, zia, kri, mca, kno, zza, maz, bps, qub, rmy, lvs, tab, nld, moa, ssg, maw, pww, sab, udm, zsm, zao, dzo, gnw, bru, kog, cwe, bim, tgo, mlh, blz, ckt, lok, smo, kpq, eng, nnq, kmr, pir, cab, tuo, bvc, xte, txu, pny, klv, jic, khm, mhy, yli, kha, dop, ojb, gvl, meq, cof, qvo, kqe, btd, bwu, nii, arb, xtn, top, lex, lob, cjp, por, ote, tmc, sun, grt, mcd, sja, naw, plw, zas, soq, khq, cek, ozm, kud, ted, bmq, pan, rjs, ktb, nhw, krj, ycn, ita, tir, prg, kpf, qup, msy, emp, ncu, qxo, ell, mzk, tim, yaz, dtb, upv, cou, noa, nhy, adh, cly, saj, fuq, rmo, gla, sim, apd, kpr, ota, kqp, gso, afr, kxc, mbt, wiu, pbb, cor, qul, gwr, twu, qve, arl, bku, alz, mto, bak, guc, lat, kgr, agm, cwt, iws, mip, ctp, khz, kyq, vie, dad, dug, yas, irk, kez, mza, nou, yue, law, kur, atg, mco, acr, lhu, myb, tik, djk, hae, tpp, yuj, mwq, rav, kzj, tuk, pbi, ffm, kmo, ybb, bgz, slk, cbv, gof, bjz, jiv, lln, xrb, cjo, qxn, prk, cot, xed, dgi, nsn, mpm, bzj, kne, cnl, bhw, gyr, akh, ntp, pls, aoz, som, tlf, xsb, eus, mfe, hak, aby, mej, myw, dsb, kru, snw, tpt, cle, nyy, tgp, agd, btt, mf1, quz, swg, sck, dyo, qvc, due, mmo, nca, oss, urt, hrv, btx, ban, pib, iri, sba, kub, lif, npl, icr, mbh, amu, sag, zpt, pss, gle, azz, hag, lzh, acu, ara, hns, zpq, mio, zty, cuc, usa, dan, miq, akb, nyo, cbi, caa, gdn, pms, mpt, wer, teo, ghs, mxt, fin, mjv, kwd, cax, zpl, ntr, ake, nog, tlj, aah, ach, mit, fij, apz, ceb, gde, gdr, mcp, cui, twb, mta, ncj, ino, men, mhi, mir, pez, quy, yre, asm, bdd, zpm, hot, zpi, kao, kyu, mvc, zpz, nzi, stp, srp, dik, guk, hat, zca, opm, aso, way, uig, krs, dig, sbl, glg, ava, avk, mkd, con, jac, mbb, heb, ces, mwv, wob, ddn, fuf, jbu, chr, kms, kwi, soy, qvn, rap, sxn, sgw, rel, ukr, gnd, bgt, thk, nob, dga, mie, orv, kyz, guh, pag, pse, tfr, cul, bhl, xsr, vag, qvw, nst, azg, muv, pad, cco, ese, gcf, pol, akp, sey, bex, vut, pam, lus, gvc, vol, stn, kdc, gym, med, wuv, gng, pui, kle, arz, myx, aak, hif, ian, sig, ign, mvp, xuo, kup, bbr, amf, zai, cya, nia, raw, nyf, ayp, czt, saq, zae, sah, kzf, swe, jam, poi, dob, hnn, mhr, okv, aze, gor, nij, aai, mkl, ron, isl, cpb, mup, nod, sus, knf, laj, nnb, tqo, bfd, cok, alj, pcm, kpw, myk, bbo, uvl, jbo, kia, kat, mux, agn, bjv, tly, mak, ixi, spp, xtd, ifu, urd, bom, bel, ruf, mhl, kek, bts, nhe, duo, mfz, otq, trs, old, bus, dbq, tcc, bba, cat, tee, cfm, bef, nwb, tca, dgz, cnk, crn, dah, chv, kwf, aom, bcl, nfr, fal, tpw, gos, crh, tnr, deu, yuw, oku, hoc, luc, rim, zar, ndy, pbc, udu, daa, miy, mog, obo, aia, knk, sgb, kbh, aoj, gaw, jvn, hsb, ljp, rnl, acc, avt, kbm, sbd, nhi, itv, yle, kbp, mzm, ame, amk, srn, ido, mqj, acm, box, xla, gag, tem, ses, boa, lmk, ker, bov, lew, bul, gbo, bmv, agu, aau, kkj, smt, ziw, ind, ter, hla, xsu, lef, qwh, zpu, xal, adj, gux, rus, ztq, kij, lgg, alp, frd, agr, miz, nin, mfq, gmv, urb, bpr, hye, boj, bua, wnu, naf, tgl, acd, sgz, lsm, yat, ton, for, fuv, wwa, tue, atq, iry, kyc, rai, pab, grn, hus, tav, lao, sda, tat, ilo, ury, nyn, lis, nkf, mtp, mxb, waj, kpz, aeu, krc, rwo, tbo, mai, avn, npy, vid, wba, mox, sne, yaa, hun, ben, mhx, viv, bav, vun, tuf, gur, cmr, cgc, sld, aon, ttc, ura, wap, dyi, gwi, ann, kue, quw, cbc, mfy, mtj, mya, mti, mgo, ppk, tac, est, ngp, pkb, zpo, cap, zab, fuh, tbc, dos, mag, mcu, xmm, cmn, mil, mww, apr, big, cdf, gvf, mda, lad, cnt, ipi, bon, kki, mqb, gum, kab, ctd, cme, ong, taj, usp, tpz, moz, ina, kvn, quf, thv, mlp, hin, sps, eka, bmr, sdm, mop, ubu, bnj, lem, gog, 
kbr, ahk, enb, gej, mif, uzb, ixl, dtp, yid, mnb, mpg, bss, ccp, muy, kto, avu, tos, dww, car, qvm, neb, csk, yrb, amn, jun, imo, nmz, gbi, maa, snc, lip, jpn, nak, bkv, awb, iba, mqf, tvw, xsm, cym, cuk, guu, mxv, nan, coe, mgh, msm, fra, sue, amr, rom, gai, kcg, mur, ctg, nlc, nch, yss, gfk, bos, myy, mxq, kpv, dnw, lac, shn, taq, tna, thl, kdj, spa, jav, anv, atb, cbs, ken, yam, asg, spl, zaw, gah, tnn, alt, enq, sqi, mib, yad, zyp, lit, sur, ife, ktj, ifk, nsu, abt, hne, rro, zpc, mfi, gun, hil, run, qxh, lia, dts, lee, ltz, mzw, pis, epo, ptp, tzj, chz, nim, pes, tzt, ngu, mor, pao, wmw, dsh, bwq, sny, zaa, ber, yut, keo, faa, kxm, ndz, arq, not, cko, ceg, dgk, gqr, tlb, bxr, kaq, mnf, jmc, mar, muh, inb, knj, tha, prf, mon, nnw, nhu, mfh, bcw, bre, kmd, acn, quc, hig, pah, sri, bfo, ade, bgr, rmc, cjv, auy, amh, war, guq, bud, tby, tlh, hrx, pmf, kwj, awa, mee, vmy, cpa, heh, dwr, cni, gui, kje, sas, srm, wuu, buk, lgl, xav, kyf, lwo, mal, fai, far, lww, oci, blt, pau, hto, cbr, abi, mbc, mim, rej, sml, min, yby, nno, lfn, roo, kor, yva, toc, tnk, knv, bvz, nds, gna, nhx, nuj, kjh, urk, gub, amm, nho, huu, qvz, mva, ile, grc, bao, mfk, sil, cbk, sll, snn, mcq, mek, slv, ksr, qvs, kaz, kqy, bkl, bib, tur, yml, suk, kaa, huv, krl, bmh, kze, csb, ape, ppo, ttr, ndj, hub, tte, ess, zos, nvm

    Supported via rule-based methods (based on unicode blocks and writing system scripts): xsr, ind, cmo, lif, ron, ojb, nod, mww, men, jun, rus, btd, jpn, hil, arb, pol, run, mai, alt, kyu, amh, taj, war, nld, pam, ljp, bud, lus, grt, mak, sun, akb, dzo, bku, urd, bru, tgl, pag, som, kbp, arz, pan, bts, vie, sas, gag, kxm, mjv, taq, fuf, ita, chr, bul, mya, jav, atb, blt, ceb, ccp, bcl, lao, ilo, mar, oss, hnn, btx, rej, ban, lis

    opened by dwhitena 6
  • Docker build succeeds, docker run fails with elasticsearch error

    I followed the instructions here https://github.com/deepset-ai/COVID-QA/tree/master/backend. Perhaps a port is not configured correctly?

    INFO:     initializing identifier
    WARNING:  PUT http://localhost:9200/document [status:N/A request:0.004s]
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 157, in _new_conn
        (self._dns_host, self.port), self.timeout, **extra_kw
      File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 84, in create_connection
        raise err
      File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 74, in create_connection
        sock.connect(sa)
    ConnectionRefusedError: [Errno 111] Connection refused
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 229, in perform_request
        method, url, body, retries=Retry(False), headers=request_headers, **kw
      File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 720, in urlopen
        method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
      File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 376, in increment
        raise six.reraise(type(error), error, _stacktrace)
      File "/usr/local/lib/python3.7/site-packages/urllib3/packages/six.py", line 735, in reraise
        raise value
      File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
        chunked=chunked,
      File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
        conn.request(method, url, **httplib_request_kw)
      File "/usr/local/lib/python3.7/http/client.py", line 1244, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/usr/local/lib/python3.7/http/client.py", line 1290, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/usr/local/lib/python3.7/http/client.py", line 1239, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/usr/local/lib/python3.7/http/client.py", line 1026, in _send_output
        self.send(msg)
      File "/usr/local/lib/python3.7/http/client.py", line 966, in send
        self.connect()
      File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 184, in connect
        conn = self._new_conn()
      File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 169, in _new_conn
        self, "Failed to establish a new connection: %s" % e
    urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f66d3bb9b10>: Failed to establish a new connection: [Errno 111] Connection refused
    [... the same connection-refused warning and traceback are repeated for three more retries ...]
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 157, in _new_conn
        (self._dns_host, self.port), self.timeout, **extra_kw
      File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 84, in create_connection
        raise err
      File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 74, in create_connection
        sock.connect(sa)
    ConnectionRefusedError: [Errno 111] Connection refused
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 229, in perform_request
        method, url, body, retries=Retry(False), headers=request_headers, **kw
      File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 720, in urlopen
        method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
      File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 376, in increment
        raise six.reraise(type(error), error, _stacktrace)
      File "/usr/local/lib/python3.7/site-packages/urllib3/packages/six.py", line 735, in reraise
        raise value
      File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
        chunked=chunked,
      File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
        conn.request(method, url, **httplib_request_kw)
      File "/usr/local/lib/python3.7/http/client.py", line 1244, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/usr/local/lib/python3.7/http/client.py", line 1290, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/usr/local/lib/python3.7/http/client.py", line 1239, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/usr/local/lib/python3.7/http/client.py", line 1026, in _send_output
        self.send(msg)
      File "/usr/local/lib/python3.7/http/client.py", line 966, in send
        self.connect()
      File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 184, in connect
        conn = self._new_conn()
      File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 169, in _new_conn
        self, "Failed to establish a new connection: %s" % e
    urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f66d3bb9cd0>: Failed to establish a new connection: [Errno 111] Connection refused
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/local/bin/uvicorn", line 8, in <module>
        sys.exit(main())
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/usr/local/lib/python3.7/site-packages/uvicorn/main.py", line 331, in main
        run(**kwargs)
      File "/usr/local/lib/python3.7/site-packages/uvicorn/main.py", line 354, in run
        server.run()
      File "/usr/local/lib/python3.7/site-packages/uvicorn/main.py", line 382, in run
        loop.run_until_complete(self.serve(sockets=sockets))
      File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
      File "/usr/local/lib/python3.7/site-packages/uvicorn/main.py", line 389, in serve
        config.load()
      File "/usr/local/lib/python3.7/site-packages/uvicorn/config.py", line 288, in load
        self.loaded_app = import_from_string(self.app)
      File "/usr/local/lib/python3.7/site-packages/uvicorn/importer.py", line 20, in import_from_string
        module = importlib.import_module(module_str)
      File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 728, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "./backend/api.py", line 11, in <module>
        from backend.controller.router import router as api_router
      File "./backend/controller/router.py", line 3, in <module>
        from backend.controller import autocomplete, model, feedback
      File "./backend/controller/model.py", line 60, in <module>
        excluded_meta_data=EXCLUDE_META_DATA_FIELDS,
      File "/home/user/src/farm-haystack/haystack/database/elasticsearch.py", line 48, in __init__
        self.client.indices.create(index=index, ignore=400, body=custom_mapping)
      File "/usr/local/lib/python3.7/site-packages/elasticsearch/client/utils.py", line 92, in _wrapped
        return func(*args, params=params, headers=headers, **kwargs)
      File "/usr/local/lib/python3.7/site-packages/elasticsearch/client/indices.py", line 104, in create
        "PUT", _make_path(index), params=params, headers=headers, body=body
      File "/usr/local/lib/python3.7/site-packages/elasticsearch/transport.py", line 362, in perform_request
        timeout=timeout,
      File "/usr/local/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 241, in perform_request
        raise ConnectionError("N/A", str(e), e)
    elasticsearch.exceptions.ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7f66d3bb9cd0>: Failed to establish a new connection: [Errno 111] Connection refused) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7f66d3bb9cd0>: Failed to establish a new connection: [Errno 111] Connection refused)
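
    The log shows that nothing is listening on localhost:9200, i.e. the backend container cannot reach an Elasticsearch instance (for example because it was not started, or because inside docker-compose it must be addressed by service name rather than localhost). A minimal sketch for checking connectivity before starting the API; adjust the URL to whatever your setup actually uses:

        import time
        import requests

        # Poll Elasticsearch until it responds; inside docker-compose the host is
        # typically the service name, not localhost.
        ES_URL = "http://localhost:9200"

        for attempt in range(30):
            try:
                r = requests.get(ES_URL, timeout=2)
                if r.status_code == 200:
                    print("Elasticsearch is up:", r.json().get("version", {}).get("number"))
                    break
            except requests.ConnectionError:
                pass
            print(f"waiting for Elasticsearch ({attempt + 1}/30) ...")
            time.sleep(2)
        else:
            raise SystemExit(f"Elasticsearch not reachable at {ES_URL}")
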
    opened by ghost 6
  • Model 2 Issue

    While using model 2, the API returns an answer for almost everything, but not in English. I believe model 2 should only return English answers.

    Try "Define gravity?"

    opened by theapache64 5
  • Create english evaluation dataset for question similarity

    We should create a simple evaluation dataset that can be used to benchmark our models for matching similar questions.

    What should be sufficient for a rough baseline (a minimal scoring sketch follows the list below):

    • 100-300 question pairs of similar questions
    • extending that with 50% false pairs
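
    A minimal scoring sketch for such a dataset, assuming the pairs are stored as (question_a, question_b, is_similar) tuples and using a Sentence-Transformers model (the model name and threshold are just examples):

        import numpy as np
        from sentence_transformers import SentenceTransformer

        # Toy pairs; in practice load the annotated pairs plus the 50% false pairs.
        pairs = [
            ("How does the virus spread?", "How is COVID-19 transmitted?", 1),
            ("How does the virus spread?", "How long should I quarantine?", 0),
        ]

        model = SentenceTransformer("bert-base-nli-mean-tokens")  # example model name
        emb_a = model.encode([a for a, _, _ in pairs])
        emb_b = model.encode([b for _, b, _ in pairs])

        def cosine(u, v):
            return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

        threshold = 0.7  # tune on a held-out split
        correct = sum(
            (cosine(u, v) >= threshold) == bool(label)
            for (u, v), (_, _, label) in zip(zip(emb_a, emb_b), pairs)
        )
        print(f"accuracy: {correct / len(pairs):.2f}")
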
    enhancement NLP / Modeling 
    opened by tholor 4
  • Add Folkhälsomyndigheten as a source with data in Swedish and English

    Here are two new sources from the Swedish government agency for public health. Hopefully I can be more helpful as I get more familiar with the project. I'm very interested in Haystack, and this is a great project for it. I'm also very interested in helping to get this running on the COVID research dataset for healthcare professionals.

    https://www.folkhalsomyndigheten.se/the-public-health-agency-of-sweden/communicable-disease-control/covid-19/

    https://www.folkhalsomyndigheten.se/smittskydd-beredskap/utbrott/aktuella-utbrott/covid-19/fragor-och-svar/

    opened by ViktorAlm 3
  • add BMG scraper

    A very nice source from the Bundesamt für Gesundheit with 106 question-answer pairs.

    We have to decide whether we want to include https://www.zusammengegencorona.de/informieren/wirtschaftliche-folgen/ and https://www.zusammengegencorona.de/informieren/weitere-informationen/. These pages only answer very specific questions or link to other sites.

    opened by Runinho 3
  • Train BERT on Quora Question Pairs Dataset

    Using Sentence Transformers (https://github.com/UKPLab/sentence-transformers) I will start training a model on the Quora Question Pairs Dataset (https://www.kaggle.com/c/quora-question-pairs) that can classify duplicate question pairs.
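
    A minimal training sketch with Sentence Transformers, assuming the Quora pairs are already loaded as (question1, question2, is_duplicate) rows; the base model, loss, and hyperparameters are illustrative choices rather than the ones finally used:

        from torch.utils.data import DataLoader
        from sentence_transformers import SentenceTransformer, InputExample, losses

        # Illustrative rows; in practice read them from the Kaggle train.csv.
        rows = [
            ("How can I learn Python?", "What is the best way to learn Python?", 1),
            ("How can I learn Python?", "How do I cook rice?", 0),
        ]
        train_examples = [
            InputExample(texts=[q1, q2], label=float(is_dup)) for q1, q2, is_dup in rows
        ]

        model = SentenceTransformer("bert-base-nli-mean-tokens")  # example base model
        train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
        train_loss = losses.CosineSimilarityLoss(model)

        model.fit(
            train_objectives=[(train_dataloader, train_loss)],
            epochs=1,
            warmup_steps=100,
            output_path="quora-duplicate-model",
        )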

    NLP / Modeling 
    opened by brandenchan 3
  • Fine-tune BERT on CORD-19 dataset

    Fine-tune BERT (or word embeddings?) on the CORD-19 dataset published on Kaggle:

    [CORD-19 is a resource of over 29,000 scholarly articles, including over 13,000 with full text, about COVID-19, SARS-CoV-2, and related coronaviruses.]

    The dataset is about 2 GB. I guess the domain is quite different from the FAQs, as the dataset is made up of scientific papers, but it could still be valuable for introducing some substantial vocabulary related to the virus.
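
    One common way to do this is masked-language-model fine-tuning with Hugging Face transformers. A minimal sketch, assuming the CORD-19 texts have already been exported to a plain-text file (the file name, base model, and hyperparameters are illustrative):

        from datasets import load_dataset
        from transformers import (
            AutoTokenizer,
            AutoModelForMaskedLM,
            DataCollatorForLanguageModeling,
            Trainer,
            TrainingArguments,
        )

        # cord19.txt is assumed to contain one paper paragraph per line.
        dataset = load_dataset("text", data_files={"train": "cord19.txt"})["train"]

        tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
        model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

        def tokenize(batch):
            return tokenizer(batch["text"], truncation=True, max_length=512)

        tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
        collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

        trainer = Trainer(
            model=model,
            args=TrainingArguments(output_dir="bert-cord19", num_train_epochs=1,
                                   per_device_train_batch_size=8),
            train_dataset=tokenized,
            data_collator=collator,
        )
        trainer.train()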

    NLP / Modeling 
    opened by andra-pumnea 3
  • Where do I get the document subset of Cord-19 used for covid-qa

    The paper mentions "We selected 147 scientific articles mostly related to COVID-19 from the CORD-19". How can I get this subset of documents to create an index?

    opened by jdpsen 1
  • Bump scrapy from 2.0.1 to 2.6.2

    Bumps scrapy from 2.0.1 to 2.6.2.

    Release notes

    Sourced from scrapy's releases.

    2.6.2

    Fixes a security issue around HTTP proxy usage, and addresses a few regressions introduced in Scrapy 2.6.0.

    See the changelog.

    2.6.1

    Fixes a regression introduced in 2.6.0 that would unset the request method when following redirects.

    2.6.0

    • Security fixes for cookie handling (see details below)
    • Python 3.10 support
    • asyncio support is no longer considered experimental, and works out-of-the-box on Windows regardless of your Python version
    • Feed exports now support pathlib.Path output paths and per-feed item filtering and post-processing

    See the full changelog

    Security bug fixes

    • When a Request object with cookies defined gets a redirect response causing a new Request object to be scheduled, the cookies defined in the original Request object are no longer copied into the new Request object.

      If you manually set the Cookie header on a Request object and the domain name of the redirect URL is not an exact match for the domain of the URL of the original Request object, your Cookie header is now dropped from the new Request object.

      The old behavior could be exploited by an attacker to gain access to your cookies. Please, see the cjvr-mfj7-j4j8 security advisory for more information.

      Note: It is still possible to enable the sharing of cookies between different domains with a shared domain suffix (e.g. example.com and any subdomain) by defining the shared domain suffix (e.g. example.com) as the cookie domain when defining your cookies. See the documentation of the Request class for more information.

    • When the domain of a cookie, either received in the Set-Cookie header of a response or defined in a Request object, is set to a public suffix (https://publicsuffix.org/), the cookie is now ignored unless the cookie domain is the same as the request domain.

      The old behavior could be exploited by an attacker to inject cookies from a controlled domain into your cookiejar that could be sent to other domains not controlled by the attacker. Please, see the mfjm-vh54-3f96 security advisory for more information.

    2.5.1

    Security bug fix:

    If you use HttpAuthMiddleware (i.e. the http_user and http_pass spider attributes) for HTTP authentication, any request exposes your credentials to the request target.

    To prevent unintended exposure of authentication credentials to unintended domains, you must now additionally set a new, additional spider attribute, http_auth_domain, and point it to the specific domain to which the authentication credentials must be sent.

    If the http_auth_domain spider attribute is not set, the domain of the first request will be considered the HTTP authentication target, and authentication credentials will only be sent in requests targeting that domain.

    If you need to send the same HTTP authentication credentials to multiple domains, you can use w3lib.http.basic_auth_header instead to set the value of the Authorization header of your requests.

    If you really want your spider to send the same HTTP authentication credentials to any domain, set the http_auth_domain spider attribute to None.
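
    A small sketch of the two options described above (the spider name, domains, and credentials are placeholders):

        import scrapy
        from w3lib.http import basic_auth_header

        class ApiSpider(scrapy.Spider):
            name = "api"
            # Option 1: HttpAuthMiddleware with an explicit target domain (Scrapy >= 2.5.1).
            http_user = "user"
            http_pass = "secret"
            http_auth_domain = "api.example.com"

            def start_requests(self):
                yield scrapy.Request("https://api.example.com/items")
                # Option 2: send credentials to another domain explicitly via the header.
                yield scrapy.Request(
                    "https://mirror.example.org/items",
                    headers={"Authorization": basic_auth_header("user", "secret")},
                )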

    Finally, if you are a user of scrapy-splash, know that this version of Scrapy breaks compatibility with scrapy-splash 0.7.2 and earlier. You will need to upgrade scrapy-splash to a greater version for it to continue to work.

    2.5.0

    • Official Python 3.9 support
    • Experimental HTTP/2 support
    • New get_retry_request() function to retry requests from spider callbacks

    ... (truncated)

    Changelog

    Sourced from scrapy's changelog.

    Scrapy 2.6.2 (2022-07-25)

    Security bug fix:

    • When HttpProxyMiddleware processes a request with proxy metadata, and that proxy metadata includes proxy credentials, HttpProxyMiddleware sets the Proxy-Authorization header, but only if that header is not already set.

      There are third-party proxy-rotation downloader middlewares that set different proxy metadata every time they process a request.

      Because of request retries and redirects, the same request can be processed by downloader middlewares more than once, including both HttpProxyMiddleware and any third-party proxy-rotation downloader middleware.

      These third-party proxy-rotation downloader middlewares could change the proxy metadata of a request to a new value, but fail to remove the Proxy-Authorization header derived from the previous value of the proxy metadata, causing the credentials of one proxy to be sent to a different proxy.

      To prevent the unintended leaking of proxy credentials, the behavior of HttpProxyMiddleware is now as follows when processing a request:

      • If the request being processed defines proxy metadata that includes credentials, the Proxy-Authorization header is always updated to feature those credentials.

      • If the request being processed defines proxy metadata without credentials, the Proxy-Authorization header is removed unless it was originally defined for the same proxy URL.

        To remove proxy credentials while keeping the same proxy URL, remove the Proxy-Authorization header.

      • If the request has no proxy metadata, or that metadata is a falsy value (e.g. None), the Proxy-Authorization header is removed.

        It is no longer possible to set a proxy URL through the proxy metadata but set the credentials through the Proxy-Authorization header. Set proxy credentials through the proxy metadata instead (a small sketch follows below).
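
    In practice that means putting the credentials inside the proxy meta value itself, for example (the proxy URL and credentials are placeholders):

        import scrapy

        class ProxiedSpider(scrapy.Spider):
            name = "proxied"

            def start_requests(self):
                # Credentials travel with the proxy meta key; HttpProxyMiddleware sets
                # (and, from 2.6.2, correctly updates) the Proxy-Authorization header.
                yield scrapy.Request(
                    "https://example.com/",
                    meta={"proxy": "http://user:pass@proxy.example.com:8080"},
                )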

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies python 
    opened by dependabot[bot] 0
  • Bump node-sass from 4.13.1 to 7.0.0 in /covid-frontend

    Bumps node-sass from 4.13.1 to 7.0.0.

    Release notes

    Sourced from node-sass's releases.

    v7.0.0

    Breaking changes

    Features

    Dependencies

    Community

    Misc

    Supported Environments

    OS           | Architecture | Node
    Windows      | x86 & x64    | 12, 14, 16, 17
    OSX          | x64          | 12, 14, 16, 17
    Linux*       | x64          | 12, 14, 16, 17
    Alpine Linux | x64          | 12, 14, 16, 17
    FreeBSD      | i386, amd64  | 12, 14

    *Linux support refers to major distributions like Ubuntu and Debian

    v6.0.1

    Dependencies

    Misc

    Supported Environments

    ... (truncated)

    Changelog

    Sourced from node-sass's changelog.

    v4.14.0

    https://github.com/sass/node-sass/releases/tag/v4.14.0

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies javascript 
    opened by dependabot[bot] 0
  • Preprocessing of context to fit max_length

    Hi, would you please help me understand how the preprocessing is done for the CovidQA corpus? I ask because the context in the CovidQA dataset seems to be much larger than the maximum length set in the code (300+, while BERT's max_length is 512 tokens). How is the data processed to fit within the limit? I couldn't find the code for that in the repo. Please advise. Thank you.
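
    For context, the standard SQuAD-style preprocessing (as done by FARM and the Hugging Face tokenizers) splits a long context into overlapping windows of at most max_seq_len tokens using a doc_stride, runs the model on every window, and picks the best-scoring answer span across windows. A minimal sketch of the windowing with a fast tokenizer; the model name and lengths are illustrative, not necessarily the values used in this repo:

        from transformers import AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

        question = "What is the incubation period of the virus?"
        context = " ".join(["The virus spreads mainly through respiratory droplets."] * 300)

        # Each window keeps the question plus a slice of the context of up to 384 tokens,
        # with 128 tokens of overlap between consecutive windows.
        enc = tokenizer(
            question,
            context,
            truncation="only_second",
            max_length=384,
            stride=128,
            return_overflowing_tokens=True,
        )
        print(f"context split into {len(enc['input_ids'])} windows")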

    opened by Geethi2020 1
  • Applied iterator pattern to Response

    Implemented the iterator pattern on classes Response and ResponseToIndividualQuestion.

    This makes it easier to traverse the results collection in the Response and access the results sequentially without needing to know its underlying representation.
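
    Roughly, the pattern looks like this (the attributes and constructor arguments are simplified stand-ins, not the exact fields from the PR):

        class ResponseToIndividualQuestion:
            def __init__(self, question, answer, score):
                self.question = question
                self.answer = answer
                self.score = score

        class Response:
            def __init__(self, results):
                self._results = list(results)

            def __iter__(self):
                # Callers can write `for result in response:` without knowing how
                # the results are stored internally.
                return iter(self._results)

        response = Response([ResponseToIndividualQuestion("How does it spread?", "Droplets.", 0.91)])
        for result in response:
            print(result.answer, result.score)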

    opened by yasserelsaid 0
Owner

deepset: Building enterprise search systems powered by the latest NLP & open source.