Unicode crash during get_files
Reported by Virgil Dupras | January 14th, 2010 @ 01:10 PM
Application Identifier: com.hardcoded_software.dupeguru
Application Version: 2.9.0
Traceback (most recent call last):
File "hsutil/cocoa.pyc", line 47, in _async_run
File "dupeguru/app.pyc", line 229, in do
File "dupeguru/directories.pyc", line 111, in get_files
File "dupeguru/directories.pyc", line 75, in _get_files
File "dupeguru/directories.pyc", line 75, in _get_files
File "dupeguru/directories.pyc", line 67, in _get_files
File "dupeguru/fs.pyc", line 161, in get_files
File "hsutil/path.pyc", line 70, in __add__
File "hsutil/path.pyc", line 61, in __new__
File "hsutil/path.pyc", line 39, in unicode_if_needed
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 5-7: invalid data
Relevant Console logs:
Jan 13 11:32:40 mamabear [0x0-0x79079].com.hardcoded_software.dupeguru[1425]: WARNING There was an error trying to detect the UTI of /Volumes/Untitled/archive/documents/audio/Schoeps Mikrofonaufsätze
Jan 13 11:33:02 mamabear [0x0-0x79079].com.hardcoded_software.dupeguru[1425]: WARNING There was an error trying to detect the UTI of /Volumes/Untitled/archive/images/2005-01-05 Dreh Fabi Tübingen
Jan 13 11:33:02 mamabear [0x0-0x79079].com.hardcoded_software.dupeguru[1425]: WARNING There was an error trying to detect the UTI of /Volumes/Untitled/archive/images/2004-09-27 (Fabi) Tobias Müller Geb (21)
Jan 13 11:33:02 mamabear [0x0-0x79079].com.hardcoded_software.dupeguru[1425]: WARNING There was an error trying to detect the UTI of /Volumes/Untitled/archive/images/2005-01-02 Dreh Fabi Tübingen
Jan 13 11:33:03 mamabear [0x0-0x79079].com.hardcoded_software.dupeguru[1425]: WARNING Could not decode 'Wiebk\xe4\xe4\xe4.jpg'
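The last warning and the traceback appear to describe the same name: the three `\xe4` bytes sit at offsets 5 to 7, which matches "position 5-7" in the error. A minimal sketch of the failure, written for Python 3 for illustration (the application itself was presumably running on Python 2, so the exact error message differs). `try_decode` is a hypothetical helper, not part of hsutil, and the Latin-1 guess is an assumption about how the name was written:

```python
# Byte string copied from the "Could not decode" console warning.
raw_name = b'Wiebk\xe4\xe4\xe4.jpg'


def try_decode(raw, encoding):
    """Hypothetical helper: return (text, None) on success, (None, error) on failure."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as e:
        return None, e


# 0xe4 announces a 3-byte UTF-8 sequence, but no continuation bytes follow,
# so decoding as UTF-8 fails at offset 5.
text, error = try_decode(raw_name, 'utf-8')
print('utf-8 failed at offset', error.start)

# Assumption: the name was stored as Latin-1, where 0xe4 is a-umlaut.
latin1_text, latin1_error = try_decode(raw_name, 'latin-1')
print('latin-1 decoded:', latin1_error is None)
```

If the Latin-1 guess is right, the file is probably "Wiebkäää.jpg" as written by a client that did not use UTF-8 for file names.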
Comments and changes to this ticket
-
Virgil Dupras February 4th, 2010 @ 06:37 PM
This seems to come from cases where network shares are scanned. I've got to set one of these up.
-
Virgil Dupras February 5th, 2010 @ 02:34 PM
- State changed from new to accepted
I tried scanning a Windows share containing files with non-accented characters, and it worked fine. What is more puzzling is that no unicode decoding is supposed to take place, as the strings are supposed to already be unicode.
I guess that the best course of action would be to ignore these errors and log them. Damn, I can't figure out how it's possible for a non-unicode string to slip into this process!
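The "ignore and log" approach could look roughly like the sketch below. It is written for Python 3 for illustration, and `decode_or_skip` and `decodable_names` are hypothetical names, not functions from hsutil or dupeguru; the UTF-8 default is also an assumption:

```python
import logging


def decode_or_skip(raw_name, encoding='utf-8'):
    """Hypothetical helper: return text, or None after logging a warning."""
    if not isinstance(raw_name, bytes):
        # Already text, nothing to decode.
        return raw_name
    try:
        return raw_name.decode(encoding)
    except UnicodeDecodeError:
        # Log the offending bytes instead of crashing the whole scan.
        logging.warning('Could not decode %r, skipping', raw_name)
        return None


def decodable_names(raw_names):
    """Keep only the names that are text or decode cleanly."""
    decoded = (decode_or_skip(name) for name in raw_names)
    return [name for name in decoded if name is not None]


names = decodable_names([b'ok.jpg', b'Wiebk\xe4\xe4\xe4.jpg', u'already_text.png'])
```

The trade-off is that skipped files are silently left out of the scan, so the log is the only trace that they were ignored.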
-
Virgil Dupras February 5th, 2010 @ 05:56 PM
- State changed from accepted to hold
(from [412d21d8f683]) [#84 state:hold] Added debug logging to fs.get_files() to eventually figure out the cause of this bug. http://bitbucket.org/hsoft/dupeguru/changeset/412d21d8f683/