- What versions are you using?
- Oracle database Version 19.4.0.0.0
- Oracle instantclient version: 21.3.0.0.0
- OS version: RHEL 7.9 x64
>>> print("platform.platform:", platform.platform())
platform.platform: Linux-3.10.0-1160.36.2.el7.x86_64-x86_64-with-glibc2.17
>>> print("sys.maxsize > 2**32:", sys.maxsize > 2**32)
sys.maxsize > 2**32: True
>>> print("platform.python_version:", platform.python_version())
platform.python_version: 3.9.5
>>> print("cx_Oracle.version:", cx_Oracle.version)
cx_Oracle.version: 8.2.1
>>> print("cx_Oracle.clientversion:", cx_Oracle.clientversion())
cx_Oracle.clientversion: (21, 3, 0, 0, 0)
- Describe the problem
First, please take a look at the code snippet in (3).
The issue is that the code runs, but it uses a LOT of memory. My rough estimate is that the final `rows` object should be no more than 2 GB, yet the script uses up all 30 GB of my machine's memory and gets terminated by the OS (so I don't know exactly how much memory it would ultimately try to allocate, since it never gets the chance). Before it is killed, the script asks for about 200 MB of additional memory each second, so my 30 GB does not last very long... At the moment it is killed, it has usually fetched no more than 50% of the rows needed.
Here comes the really strange thing: I can avoid the above issue by commenting out the line that stores the fetched rows (`rows.extend(fetched_rows)` in the snippet below). That is, if I don't save the fetched rows and just keep reading and discarding them, the script becomes totally fine...

One may ask: could there be a mismatch between the rows' actual size and cx_Oracle's idea of their size? This is unlikely. I tried printing `cursor.description` and the result is the following:

`[('Date', <class 'cx_Oracle.DATETIME'>, 23, None, None, None, 0), ('ID', <class 'cx_Oracle.STRING'>, 9, 9, None, None, 0), ('KeyName', <class 'cx_Oracle.STRING'>, 53, 53, None, None, 0), ('ValueName', <class 'cx_Oracle.NUMBER'>, 127, None, 126, -127, 0)]`

Looks normal.
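For reference, this is the kind of rough sketch I would use to sanity-check the ~2 GB estimate once some rows have been fetched. The helper `approx_rows_size` is just something I wrote for illustration; `sys.getsizeof` is shallow, so the totals are only approximate:

```python
# Rough, per-object estimate of how much memory the fetched rows occupy.
# sys.getsizeof is shallow, so this is only an approximation of the true footprint.
import sys

def approx_rows_size(rows):
    total = sys.getsizeof(rows)                       # the outer list
    for row in rows:
        total += sys.getsizeof(row)                   # each row tuple
        total += sum(sys.getsizeof(v) for v in row)   # each column value
    return total

# e.g. print(f"rows take roughly {approx_rows_size(rows) / 1024**2:.0f} MiB")
```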
- Include a runnable Python script that shows the problem.
```python
import cx_Oracle

dsn = cx_Oracle.makedsn(host=[host], port=[port], sid=[sid])
con = cx_Oracle.connect(user=[username], password=[password], dsn=dsn)

rows = []
with con.cursor() as cursor:
    cursor.arraysize = 100_000  # Only effective when using cx_Oracle
    sql = """
        SELECT "Date", "ID", "KeyName", "ValueName"
        FROM "A_Table"
        WHERE ("A_Table"."Date" >= '07-SEP-20' AND
               "A_Table"."Date" <= '07-SEP-21')
    """
    # I just want to get everything from the last year
    cursor.execute(sql)
    while True:
        fetched_rows = cursor.fetchmany()  # defaults to fetching cursor.arraysize rows
        if not fetched_rows:
            break
        rows.extend(fetched_rows)
```
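For completeness, a shorter variant of the same fetch loop iterates the cursor directly (cx_Oracle still fetches from the server in batches of `cursor.arraysize` rows internally); I assume this does not change the behavior, and it uses the same `con` and `sql` as above:

```python
# Equivalent fetch loop, iterating the cursor directly; rows are still
# fetched from the server in batches of cursor.arraysize.
rows = []
with con.cursor() as cursor:
    cursor.arraysize = 100_000
    cursor.execute(sql)
    for row in cursor:
        rows.append(row)
```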
- Extra Attempts
What if I do this instead:
```python
import copy
import gc

while True:
    fetched_rows = cursor.fetchmany()
    if not fetched_rows:
        break
    dcfr = copy.deepcopy(fetched_rows)
    fetched_rows = None  # keep only a deepcopy of fetched_rows, hoping this releases the original object and its space
    rows.extend(dcfr)
    gc.collect()  # manual garbage collection to avoid any delay
```
No, it doesn't solve the issue. I also tried SQL*Plus (see https://dba.stackexchange.com/questions/299426/the-expected-read-speed-of-sqlplus); it does NOT suffer from the same issue and reaches a local read speed of about 50 MB/s.
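In case it is useful for diagnosis, this is the instrumentation I can add around the fetch loop to see which Python-level lines are allocating memory. An important caveat: allocations made inside the Oracle client C libraries will not show up here, so this only narrows things down on the Python side:

```python
# Sketch: trace Python-level allocations during the fetch loop.
# Allocations inside the Oracle client C libraries are not visible to tracemalloc.
import tracemalloc

tracemalloc.start()
# ... run the fetch loop from above here ...
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:10]:
    print(stat)
```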
Also, I suspect this is not a cx_Oracle bug, since pyodbc suffers from the same issue. You can check out my question on StackExchange (mostly the same as this one): https://dba.stackexchange.com/questions/299444/possible-memory-leak-of-cx-oracle-pyodbc-oracle-instant-client

I canNOT rule out the possibility that some machine-specific settings are wrong, so feel free to ask me to provide more info, such as the contents of `sqlnet.ora`, etc.
Any idea?...