When cchardet-2.1.7 and chardet-5.0.0 are both installed, the following tests fail.
From what I can see, two of them fail because of encoding name case mismatches (the tests expect windows-1255 and Big5, but the reported values are the uppercase WINDOWS-1255 and BIG5), and the other two fail because the detected encoding is a superset of the expected one (EUC-KR is reported as UHC, and GB2312 as GB18030).
...F...FF.F.....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
..................................................................................................................................................................................................................................................................................................................................................................
======================================================================
FAIL: test_001742 (__main__.TestCase)
./tests/illformed/chardet/windows1255.xml: windows-1255 with no encoding information
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/feedparser/tests/runtests.py", line 1191, in fn
self.fail_unless_eval(xmlfile, eval_string)
File "/tmp/feedparser/tests/runtests.py", line 177, in fail_unless_eval
raise self.failureException(failure)
AssertionError: not eval(b"bozo and encoding == 'windows-1255'")
WITH env({'bozo': True,
'bozo_exception': CharacterEncodingOverride('document declared as utf-8, but parsed as WINDOWS-1255'),
'content-type': '',
'encoding': 'WINDOWS-1255',
'entries': [{'summary': 'האם תדפיס נייר של אתר אינטרנט שמוצג על מסך משתמש הוא '
'העתק נאמן למקור של אתר האינטרנט? רבים יגידו שכן, '
'ולפעמים גם בתי המשפט יצטרפו אליהם שיקבלו פלט מאתר '
'אינטרנט כראיה קבילה. אבל, זה ממש לא כך. ויש אפילו '
'הוכחה מדהימה.',
'summary_detail': {'base': '',
'language': None,
'type': 'text/html',
'value': 'האם תדפיס נייר של אתר אינטרנט שמוצג '
'על מסך משתמש הוא העתק נאמן למקור של '
'אתר האינטרנט? רבים יגידו שכן, '
'ולפעמים גם בתי המשפט יצטרפו אליהם '
'שיקבלו פלט מאתר אינטרנט כראיה '
'קבילה. אבל, זה ממש לא כך. ויש אפילו '
'הוכחה מדהימה.'}}],
'feed': {},
'headers': {},
'namespaces': {},
'version': 'rss'})
======================================================================
FAIL: test_001746 (__main__.TestCase)
./tests/illformed/chardet/gb2312.xml: GB2312 with no encoding information
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/feedparser/tests/runtests.py", line 1191, in fn
self.fail_unless_eval(xmlfile, eval_string)
File "/tmp/feedparser/tests/runtests.py", line 177, in fail_unless_eval
raise self.failureException(failure)
AssertionError: not eval(b"bozo and encoding == 'GB2312'")
WITH env({'bozo': True,
'bozo_exception': CharacterEncodingOverride('document declared as utf-8, but parsed as GB18030'),
'content-type': '',
'encoding': 'GB18030',
'entries': [{'title': '不归移民漫画系列:专业工作',
'title_detail': {'base': '',
'language': None,
'type': 'text/plain',
'value': '不归移民漫画系列:专业工作'}}],
'feed': {},
'headers': {},
'namespaces': {},
'version': 'rss'})
======================================================================
FAIL: test_001747 (__main__.TestCase)
./tests/illformed/chardet/euckr.xml: EUC-KR with no encoding information
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/feedparser/tests/runtests.py", line 1191, in fn
self.fail_unless_eval(xmlfile, eval_string)
File "/tmp/feedparser/tests/runtests.py", line 177, in fail_unless_eval
raise self.failureException(failure)
AssertionError: not eval(b"bozo and encoding == 'EUC-KR'")
WITH env({'bozo': True,
'bozo_exception': CharacterEncodingOverride('document declared as utf-8, but parsed as UHC'),
'content-type': '',
'encoding': 'UHC',
'entries': [{'summary': 'TypeKey 시스템이 UTF-8로 돌아가는데, 거기서 한글로 된 닉네임을 정할 경우에, '
'EUC-KR로 된 무버블타입 블록에선 리다이렉트되어 전송되어오는 닉네임이 UTF라 당연히 '
'깨어져 나타난다. 실제 블록 등에서 사용하는 필명 내지는 닉네임은 한글로 사용하는 많은 분들도 '
'타입키에서의 닉네임은 이런 문제때문에 울며겨자먹기로 영어로 짓고 있다....',
'summary_detail': {'base': '',
'language': None,
'type': 'text/html',
'value': 'TypeKey 시스템이 UTF-8로 돌아가는데, 거기서 한글로 '
'된 닉네임을 정할 경우에, EUC-KR로 된 무버블타입 블록에선 '
'리다이렉트되어 전송되어오는 닉네임이 UTF라 당연히 깨어져 '
'나타난다. 실제 블록 등에서 사용하는 필명 내지는 닉네임은 '
'한글로 사용하는 많은 분들도 타입키에서의 닉네임은 이런 '
'문제때문에 울며겨자먹기로 영어로 짓고 있다....'},
'title': 'EUC-KR 에서 TypeKey 한글닉네임 표시하기',
'title_detail': {'base': '',
'language': None,
'type': 'text/plain',
'value': 'EUC-KR 에서 TypeKey 한글닉네임 표시하기'}}],
'feed': {},
'headers': {},
'namespaces': {},
'version': 'rss'})
======================================================================
FAIL: test_001749 (__main__.TestCase)
./tests/illformed/chardet/big5.xml: Big5 with no encoding information
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/feedparser/tests/runtests.py", line 1191, in fn
self.fail_unless_eval(xmlfile, eval_string)
File "/tmp/feedparser/tests/runtests.py", line 177, in fail_unless_eval
raise self.failureException(failure)
AssertionError: not eval(b"bozo and encoding == 'Big5'")
WITH env({'bozo': True,
'bozo_exception': CharacterEncodingOverride('document declared as utf-8, but parsed as BIG5'),
'content-type': '',
'encoding': 'BIG5',
'entries': [],
'feed': {'title': '我希望??很容易?其翻?成中文,并有助于改??件。 感?您??本文。',
'title_detail': {'base': '',
'language': None,
'type': 'text/plain',
'value': '我希望??很容易?其翻?成中文,并有助于改??件。 感?您??本文。'}},
'headers': {},
'namespaces': {'': 'http://www.w3.org/2005/Atom'},
'version': 'atom10'})
----------------------------------------------------------------------
Ran 4354 tests in 4.892s
FAILED (failures=4)
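The two failure classes described at the top can be told apart mechanically. Below is a minimal sketch, assuming it is acceptable to compare encodings by Python's canonical codec name rather than by exact string. The names `SUPERSETS`, `canonical`, and `classify` are hypothetical and are not part of feedparser or its test suite; only `codecs.lookup` is a real standard-library call.

```python
import codecs

# Hypothetical table (not from feedparser): canonical name of a detected
# encoding -> canonical names of the encodings it is a superset of.
SUPERSETS = {
    "cp949": {"euc_kr"},    # UHC (cp949) extends EUC-KR
    "gb18030": {"gb2312"},  # GB18030 extends GB2312
}


def canonical(name):
    """Return Python's canonical codec name; lookup is case-insensitive."""
    return codecs.lookup(name).name


def classify(expected, detected):
    """Classify a detection result relative to the expected encoding."""
    exp, det = canonical(expected), canonical(detected)
    if exp == det:
        # Same codec, so any difference is only in spelling or case.
        return "same codec"
    if exp in SUPERSETS.get(det, ()):
        return "superset"
    return "different"


# The four (expected, detected) pairs taken from the failures above.
for expected, detected in [
    ("windows-1255", "WINDOWS-1255"),
    ("Big5", "BIG5"),
    ("EUC-KR", "UHC"),
    ("GB2312", "GB18030"),
]:
    print(expected, detected, classify(expected, detected))
```

With this comparison the first two pairs resolve to the same codec, while the last two are flagged as supersets. Whether the tests should accept a superset encoding is a separate decision this sketch does not make.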