#170 ✓invalid
Virgil Dupras

Encoding crash on saving results

Reported by Virgil Dupras | August 1st, 2011 @ 05:11 PM

Application Name: dupeGuru Music Edition
Version: 6.0.1

Traceback (most recent call last):
 File "/usr/local/share/dupeguru_me/qt/base/result_window.py", line 338, in saveResultsTriggered
 File "/usr/local/share/dupeguru_me/core/app.py", line 366, in save_as
 File "/usr/local/share/dupeguru_me/core/results.py", line 308, in save_to_xml
   tree.write(fp, encoding='utf-8')
 File "/usr/lib/python3.1/xml/etree/ElementTree.py", line 659, in write
   self._write(file, self._root, encoding, {})
 File "/usr/lib/python3.1/xml/etree/ElementTree.py", line 701, in _write
   self._write(file, n, encoding, namespaces)
 File "/usr/lib/python3.1/xml/etree/ElementTree.py", line 701, in _write
   self._write(file, n, encoding, namespaces)
 File "/usr/lib/python3.1/xml/etree/ElementTree.py", line 693, in _write
   file.write(_encode(" %s=\"%s\"" % (k, _escape_attrib(v)), encoding))
 File "/usr/lib/python3.1/xml/etree/ElementTree.py", line 745, in _encode
   return s.encode(encoding)
UnicodeEncodeError: 'utf-8' codec can't encode character '\udc85' in position 127: surrogates not allowed

Comments and changes to this ticket

  • Virgil Dupras

    Virgil Dupras August 24th, 2011 @ 03:34 PM

    • State changed from “new” to “accepted”
    • Assigned user set to “Virgil Dupras”

    I can reproduce this bug by programmatically renaming a file under Ubuntu to a name that has '\udc85' in it. The file manager then displays "(invalid encoding)" next to the name. Scan that file, then save the results, and you get this crash. We should try to encode pathnames before saving, and if that fails, skip the file and emit a warning.
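
A minimal sketch of the check proposed above, not the actual dupeGuru code. The helper name `encodable_paths` and the sample paths are hypothetical. It assumes the bad name comes from Python's "surrogateescape" handling of undecodable filename bytes, which turns a raw 0x85 byte into '\udc85'.

```python
import warnings


def encodable_paths(paths, encoding="utf-8"):
    """Yield only the paths that can be encoded; warn about the rest.

    Hypothetical helper, shown for illustration only.
    """
    for path in paths:
        try:
            path.encode(encoding)
        except UnicodeEncodeError:
            # Lone surrogates such as '\udc85' cannot be encoded as UTF-8.
            warnings.warn("Skipping path with invalid encoding: %r" % path)
            continue
        yield path


# A raw 0x85 byte decoded with surrogateescape becomes the lone surrogate
# from the traceback.
bad_char = b"\x85".decode("utf-8", "surrogateescape")

# Made-up paths: one clean, one containing the lone surrogate.
paths = ["/music/ok.mp3", "/music/bad" + bad_char + ".mp3"]
kept = list(encodable_paths(paths))
```

Only the clean path ends up in `kept`; the other one triggers a warning instead of a crash at save time.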

  • Virgil Dupras

    Virgil Dupras August 24th, 2011 @ 05:03 PM

    • State changed from “accepted” to “invalid”

    Oh well, after lots of messing around wondering why I couldn't reproduce the bug on my dev environment, I discovered that etree in Python 3.2 automatically works around this problem and replaces unencodable characters with XML character references. For example, '\udc85' is replaced by &#56453;. I'm not fixing this, as I'll soon move the Linux version to Ubuntu 11.04 (which has Python 3.2) anyway.
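
The behaviour described above can be checked with a short sketch, assuming Python 3.2 or later. The element name `file` and the path are made up for illustration. Strict UTF-8 encoding rejects the lone surrogate, while the "xmlcharrefreplace" error handler turns it into a numeric character reference (0xDC85 is 56453 in decimal).

```python
import io
import xml.etree.ElementTree as ET

bad_path = "/music/bad\udc85.mp3"  # made-up path containing a lone surrogate

# Strict encoding fails, as in the traceback from Python 3.1.
try:
    bad_path.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# The xmlcharrefreplace handler substitutes a numeric character reference.
replaced = bad_path.encode("utf-8", "xmlcharrefreplace")

# Writing the same value as an attribute through ElementTree no longer raises.
root = ET.Element("file", path=bad_path)
buf = io.BytesIO()
ET.ElementTree(root).write(buf, encoding="utf-8")
data = buf.getvalue()
```
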
