This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Patch not showing in patchwork


On Tue, Jun 10, 2014 at 08:45:09PM +0100, Pedro Alves wrote:
> On 06/10/2014 05:39 PM, Joseph S. Myers wrote:
> > I noticed that my patch 
> > <https://sourceware.org/ml/libc-alpha/2014-06/msg00077.html> does not 
> > appear in the list of patches in patchwork (it appears not to have a patch 
> > entry in patchwork at all, rather than having been mistakenly marked as 
> > committed).  Is there some way to tell why it didn't get recognised by 
> > patchwork?
> > 
> 
> I'll guess due to:
> 
>  Content-Type: text/plain; charset="X-UNKNOWN"

This was the problem.  I have pushed the patch manually:

http://patchwork.sourceware.org/patch/1430/

I have also applied the following patch to patchwork to allow it to
fall back on alternate encodings if the charset is unknown.  I'll post
it to the patchwork mailing list.

Siddhesh


--- a/apps/patchwork/bin/parsemail.py	2014-06-11 15:53:12.685666812 +0530
+++ b/apps/patchwork/bin/parsemail.py	2014-06-11 15:53:03.991667186 +0530
@@ -147,6 +147,13 @@
         return match.group(1)
     return None
 
+def try_decode(payload, charset):
+    try:
+        payload = unicode(payload, charset)
+    except UnicodeDecodeError:
+        return None
+    return payload
+
 def find_content(project, mail):
     patchbuf = None
     commentbuf = ''
@@ -157,15 +164,27 @@
             continue
 
         payload = part.get_payload(decode=True)
-        charset = part.get_content_charset()
         subtype = part.get_content_subtype()
 
-        # if we don't have a charset, assume utf-8
-        if charset is None:
-            charset = 'utf-8'
-
         if not isinstance(payload, unicode):
-            payload = unicode(payload, charset)
+            charset = part.get_content_charset()
+
+            # If there is no charset or if it is unknown, then try some common
+            # charsets before we fail.
+            if charset is None or charset == 'x-unknown':
+                try_charsets = ['utf-8', 'windows-1252', 'ascii', 'iso-8859']
+            else:
+                try_charsets = [charset]
+
+            for cset in try_charsets:
+                decoded_payload = try_decode(payload, cset)
+                if decoded_payload is not None:
+                    break
+            payload = decoded_payload
+
+            # Could not find a valid decoded payload.  Fail.
+            if payload is None:
+                return (None, None)
 
         if subtype in ['x-patch', 'x-diff']:
             patchbuf = payload

Attachment: pgpJTC6zQiP7y.pgp
Description: PGP signature


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]