cover the full range of Unicode

author David Lawrence Ramsey <pooka109@gmail.com>

Mon, 8 Aug 2005 23:47:28 +0000 (23:47 +0000)

committer David Lawrence Ramsey <pooka109@gmail.com>

Mon, 8 Aug 2005 23:47:28 +0000 (23:47 +0000)
diff --git a/ChangeLog b/ChangeLog
index 9ed41b141df672c60275816ac9f686e5bf77897c..4e337dc7d50b2750da995c306a5fa007e898541e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -138,9 +138,10 @@ CVS code -
           invalid, since the C library's multibyte functions don't seem
           to.  New function is_valid_unicode(); changes to mbrep() and
           make_mbchar(). (DLR)
-       - Store Unicode values in longs instead of ints.  Changes to
-         make_mbchar(), parse_kbinput(), get_unicode_kbinput(), and
-         parse_verbatim_kbinput(). (DLR)
+       - Store Unicode values in longs instead of ints, and cover the
+         entire range of Unicode.  Changes to make_mbchar(),
+         is_valid_unicode(), parse_kbinput(), get_unicode_kbinput(),
+         parse_verbatim_kbinput(), and faq.html. (DLR)
  - color.c:
         - Remove unneeded fcntl.h include. (DLR)
  - chars.c:
diff --git a/doc/faq.html b/doc/faq.html
index e8d7fbbaa1fccd03b5fb78e8c91cf94e451e2a0e..99e229a43565b8a56fd3ca782f9e564e1983a60f 100644
--- a/doc/faq.html
+++ b/doc/faq.html
@@ -167,7 +167,7 @@
    <p>You can move between the buffers you have open with the <b>Meta-&lt;</b> and <b>Meta-&gt;</b> keys, or more easily with <b>Meta-,</b> and <b>Meta-.</b> (clear as mud, right? =-). When you have more than one file buffer open, the ^X shortcut will say &quot;Close&quot;, instead of the normal &quot;Exit&quot; when only one buffer is open.</p></blockquote>
  <h2><a name="3.8"></a>3.8. Tell me more about this verbatim input stuff!</h2>
  <blockquote><p>To use verbatim input, you must be using nano 1.3.1 or newer. When you want to insert a literal character into the file you're editing, such as a control character that nano usually treats as a command, first press <b>Meta-V</b>. (If you're not at a prompt, you'll get the message &quot;Verbatim Input&quot;.) Then press the key(s) that generate the character you want.</p>
-  <p>Alternatively, you can press <b>Meta-V</b> and then type a four-digit hexadecimal code from 0000 to ffff (case-insensitive), and the character with the corresponding value will be inserted instead.</p></blockquote>
+  <p>Alternatively, you can press <b>Meta-V</b> and then type a six-digit hexadecimal code from 000000 to 10FFFF (case-insensitive), and the character with the corresponding value will be inserted instead.</p></blockquote>
  <h2><a name="3.9"></a>3.9. How do I make a .nanorc file that nano will read when I start it?</h2>
  <blockquote><p>It's not hard at all! But, your version of nano must have been compiled with <b>--enable-nanorc</b>, and again must be version 1.1.12 or newer (use nano -V to check your version and compiled features). Then simply copy the <b>nanorc.sample</b> that came with the nano source or your nano package (most likely in /usr/doc/nano) to .nanorc in your home directory. If you didn't get one, the syntax is simple. Flags are turned on and off by using the word <b>set</b> and the getopt_long flag for the feature, for example &quot;set nowrap&quot; or &quot;set suspend&quot;.</p></blockquote>
  <hr width="100%">
@@ -250,6 +250,7 @@
  <h2><a name="8"></a>8. ChangeLog</h2>
  <blockquote>
  <p>
+2005/08/08 - Update section 3.8 to mention that verbatim input mode now takes a six-digit hexadecimal number. (DLR)<br>
  2005/07/04 - Update section 4.10 to mention that pasting from the X clipboard via the middle mouse button also works when the Shift key is used.<br>
  2005/06/15 - Update description of --enable-extra, and add missing line breaks. (DLR)<br>
  2005/06/13 - Minor capitalization and wording fixes. (DLR)<br>
diff --git a/src/chars.c b/src/chars.c
index c7f61038cfd2f9f268c96120f8c586986f1375b3..bb41f6000ae4af27270e4273b8b0e3433d992b62 100644
--- a/src/chars.c
+++ b/src/chars.c
@@ -888,8 +888,8 @@ bool has_blank_mbchars(const char *s)
   * ranges D800-DFFF or FFFE-FFFF), and FALSE otherwise. */
  bool is_valid_unicode(wchar_t wc)
  {
-    return (0 <= wc && (wc <= 0xD7FF || 0xE000 <= wc) && (wc !=
-       0xFFFE && wc != 0xFFFF));
+    return (0 <= wc && (wc <= 0xD7FF || 0xE000 <= wc) && (wc <=
+       0xFFFD || 0x10000 <= wc));
  }
  #endif
  
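
Note on the chars.c hunk: the new test accepts every non-negative value outside D800-DFFF and FFFE-FFFF, so code points above FFFF now pass. The following is a minimal standalone sketch of the same range test, not the code in chars.c. The name valid_unicode_sketch() and the long parameter are assumptions made for illustration, and the explicit 10FFFF upper bound is an addition that the patched function does not have (the key-input parser changed in this commit cannot produce a larger value on its own).

```c
#include <stdbool.h>

/* Hypothetical standalone variant of the range test; not nano's
 * is_valid_unicode().  It takes a long so that the result does not
 * depend on the platform's wchar_t width. */
bool valid_unicode_sketch(long cp)
{
    /* Reject negative values and anything past the last code point.
     * The upper bound is an assumption added for this sketch. */
    if (cp < 0 || cp > 0x10FFFF)
	return false;

    /* Reject the surrogate range D800-DFFF. */
    if (0xD800 <= cp && cp <= 0xDFFF)
	return false;

    /* Reject FFFE and FFFF, the two values the patched test excludes. */
    if (cp == 0xFFFE || cp == 0xFFFF)
	return false;

    return true;
}
```

With this sketch, 0x10000 and 0x10FFFF are accepted, while 0xD800, 0xFFFF, and 0x110000 are rejected.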
diff --git a/src/winio.c b/src/winio.c
index 3ecde58555547e3713438f7d7a34f5174f88e802..ade2f6eed90f1eaf26c3a7157f0a02749270bce1 100644
--- a/src/winio.c
+++ b/src/winio.c
@@ -1232,9 +1232,9 @@ int get_byte_kbinput(int kbinput
      return retval;
  }
  
-/* Translate a Unicode sequence: turn a four-digit hexadecimal number
- * from 0000 to FFFF (case-insensitive) into its corresponding multibyte
- * value. */
+/* Translate a Unicode sequence: turn a six-digit hexadecimal number
+ * from 000000 to 10FFFF (case-insensitive) into its corresponding
+ * multibyte value. */
  long get_unicode_kbinput(int kbinput
  #ifndef NANO_SMALL
         , bool reset
@@ -1253,15 +1253,41 @@ long get_unicode_kbinput(int kbinput
      }
  #endif
  
-    /* Increment the word digit counter. */
+    /* Increment the Unicode digit counter. */
      uni_digits++;
  
      switch (uni_digits) {
         case 1:
             /* One digit: reset the Unicode sequence holder and add the
-            * digit we got to the 0x1000's position of the Unicode
+            * digit we got to the 0x100000's position of the Unicode
              * sequence holder. */
             uni = 0;
+           if ('0' <= kbinput && kbinput <= '1')
+               uni += (kbinput - '0') * 0x100000;
+           else
+               /* If the character we got isn't a hexadecimal digit, or
+                * if it is and it would put the Unicode sequence out of
+                * valid range, save it as the result. */
+               retval = kbinput;
+           break;
+       case 2:
+           /* Two digits: add the digit we got to the 0x10000's
+            * position of the Unicode sequence holder. */
+           if ('0' == kbinput || (uni < 0x100000 && '1' <= kbinput &&
+               kbinput <= '9'))
+               uni += (kbinput - '0') * 0x10000;
+           else if (uni < 0x100000 && 'a' <= tolower(kbinput) &&
+               tolower(kbinput) <= 'f')
+               uni += (tolower(kbinput) + 10 - 'a') * 0x10000;
+           else
+               /* If the character we got isn't a hexadecimal digit, or
+                * if it is and it would put the Unicode sequence out of
+                * valid range, save it as the result. */
+               retval = kbinput;
+           break;
+       case 3:
+           /* Three digits: add the digit we got to the 0x1000's
+            * position of the Unicode sequence holder. */
             if ('0' <= kbinput && kbinput <= '9')
                 uni += (kbinput - '0') * 0x1000;
             else if ('a' <= tolower(kbinput) && tolower(kbinput) <= 'f')
@@ -1272,8 +1298,8 @@ long get_unicode_kbinput(int kbinput
                  * valid range, save it as the result. */
                 retval = kbinput;
             break;
-       case 2:
-           /* Two digits: add the digit we got to the 0x100's position
+       case 4:
+           /* Four digits: add the digit we got to the 0x100's position
              * of the Unicode sequence holder. */
             if ('0' <= kbinput && kbinput <= '9')
                 uni += (kbinput - '0') * 0x100;
@@ -1285,8 +1311,8 @@ long get_unicode_kbinput(int kbinput
                  * valid range, save it as the result. */
                 retval = kbinput;
             break;
-       case 3:
-           /* Three digits: add the digit we got to the 0x10's position
+       case 5:
+           /* Five digits: add the digit we got to the 0x10's position
              * of the Unicode sequence holder. */
             if ('0' <= kbinput && kbinput <= '9')
                 uni += (kbinput - '0') * 0x10;
@@ -1298,8 +1324,8 @@ long get_unicode_kbinput(int kbinput
                  * valid range, save it as the result. */
                 retval = kbinput;
             break;
-       case 4:
-           /* Four digits: add the digit we got to the 1's position of
+       case 6:
+           /* Six digits: add the digit we got to the 1's position of
              * the Unicode sequence holder, and save the corresponding
              * Unicode value as the result. */
             if ('0' <= kbinput && kbinput <= '9') {
@@ -1316,14 +1342,14 @@ long get_unicode_kbinput(int kbinput
                 retval = kbinput;
             break;
         default:
-           /* More than four digits: save the character we got as the
+           /* More than six digits: save the character we got as the
              * result. */
             retval = kbinput;
             break;
      }
  
-    /* If we have a result, reset the word digit counter and the word
-     * sequence holder. */
+    /* If we have a result, reset the Unicode digit counter and the
+     * Unicode sequence holder. */
      if (retval != ERR) {
         uni_digits = 0;
         uni = 0;
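
Note on the winio.c hunks: the per-digit limits in cases 1 and 2 (first digit 0 or 1, and only 0 allowed after a leading 1) are what keep the result within 000000 to 10FFFF. The sketch below shows the same limits applied to a whole string at once. It is an illustration under stated assumptions, not nano's code: parse_six_hex_sketch() is a hypothetical helper, it returns -1 on bad input, and it reads a complete string, whereas get_unicode_kbinput() handles one keystroke per call and returns the offending key as its result.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of nano: convert exactly six
 * hexadecimal digits (case-insensitive) to a code point in the range
 * 000000 to 10FFFF, or return -1 if the input is not acceptable. */
long parse_six_hex_sketch(const char *s)
{
    long uni = 0;
    size_t i;

    for (i = 0; i < 6; i++) {
	int c = tolower((unsigned char)s[i]);
	long digit;

	if ('0' <= c && c <= '9')
	    digit = c - '0';
	else if ('a' <= c && c <= 'f')
	    digit = c - 'a' + 10;
	else
	    /* Not a hexadecimal digit.  This also catches a string
	     * that ends before the sixth digit. */
	    return -1;

	/* The first digit may only be 0 or 1, and a leading 1 may only
	 * be followed by 0; anything else would exceed 10FFFF. */
	if ((i == 0 && digit > 1) || (i == 1 && uni == 1 && digit != 0))
	    return -1;

	uni = uni * 16 + digit;
    }

    /* Anything after the sixth digit is rejected in this sketch. */
    return (s[6] == '\0') ? uni : -1;
}
```

For example, "0000e9" yields 0xE9 and "10FFFF" yields 0x10FFFF, while "110000", "200000", and the four-digit "00e9" are all rejected.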