From: David Lawrence Ramsey <pooka109@gmail.com>
Date: Mon, 8 Aug 2005 23:47:28 +0000 (+0000)
Subject: cover the full range of Unicode
X-Git-Tag: v1.3.9~68
X-Git-Url: https://git.wh0rd.org/?a=commitdiff_plain;h=8c7a562394ee08f6e717c4afc0a8282febbe4dbd;p=nano.git

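Cover the full range of Unicode: verbatim input now takes a six-digit
hexadecimal code from 000000 to 10FFFF instead of a four-digit code
from 0000 to FFFF.  Section 3.8 of the FAQ is updated to match.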


git-svn-id: svn://svn.savannah.gnu.org/nano/trunk/nano@2978 35c25a1d-7b9e-4130-9fde-d3aeb78583b8
---
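Reviewer note, not part of the commit: the patch below builds the code
point one keystroke at a time inside get_unicode_kbinput().  As a rough
illustration of the same range rules, here is a minimal sketch that
parses a complete six-digit string in one call.  The names
sketch_is_valid_code() and sketch_parse_six_hex() are hypothetical and
do not exist in nano, and the explicit 10FFFF upper bound in the
validity check is this sketch's own assumption.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of nano: return true if code lies in
 * 0..10FFFF and is neither a surrogate (D800-DFFF) nor one of the
 * noncharacters FFFE and FFFF. */
bool sketch_is_valid_code(long code)
{
    return 0 <= code && code <= 0x10FFFF &&
        (code <= 0xD7FF || 0xE000 <= code) &&
        (code <= 0xFFFD || 0x10000 <= code);
}

/* Hypothetical helper, not part of nano: parse exactly six hexadecimal
 * digits (case-insensitive) into a code point.  Return -1 if the string
 * is not exactly six hex digits or if the value is above 10FFFF. */
long sketch_parse_six_hex(const char *s)
{
    long code = 0;
    size_t i;

    for (i = 0; i < 6; i++) {
        int c = tolower((unsigned char)s[i]);
        int digit;

        if ('0' <= c && c <= '9')
            digit = c - '0';
        else if ('a' <= c && c <= 'f')
            digit = c - 'a' + 10;
        else
            /* Not a hex digit; this also catches a string that ends
             * early, so we never read past the terminator. */
            return -1;

        /* Shift the digits so far up one hex place and add this one. */
        code = code * 0x10 + digit;
    }

    /* Reject trailing characters and values beyond the last plane. */
    if (s[6] != '\0' || code > 0x10FFFF)
        return -1;

    return code;
}
```

Unlike the patch, which rejects an out-of-range digit as soon as it is
typed (the first digit must be 0 or 1, and the second must be 0 when the
first is 1), this sketch checks the upper bound once after all six
digits are read.  The accepted set of six-digit inputs is the same.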

diff --git a/ChangeLog b/ChangeLog
index 9ed41b14..4e337dc7 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -138,9 +138,10 @@ CVS code -
 	  invalid, since the C library's multibyte functions don't seem
 	  to.  New function is_valid_unicode(); changes to mbrep() and
 	  make_mbchar(). (DLR)
-	- Store Unicode values in longs instead of ints.  Changes to
-	  make_mbchar(), parse_kbinput(), get_unicode_kbinput(), and
-	  parse_verbatim_kbinput(). (DLR)
+	- Store Unicode values in longs instead of ints, and cover the
+	  entire range of Unicode.  Changes to make_mbchar(),
+	  is_valid_unicode(), parse_kbinput(), get_unicode_kbinput(),
+	  parse_verbatim_kbinput(), and faq.html. (DLR)
 - color.c:
 	- Remove unneeded fcntl.h include. (DLR)
 - chars.c:
diff --git a/doc/faq.html b/doc/faq.html
index e8d7fbba..99e229a4 100644
--- a/doc/faq.html
+++ b/doc/faq.html
@@ -167,7 +167,7 @@
   <p>You can move between the buffers you have open with the <b>Meta-&lt;</b> and <b>Meta-&gt;</b> keys, or more easily with <b>Meta-,</b> and <b>Meta-.</b> (clear as mud, right? =-). When you have more than one file buffer open, the ^X shortcut will say &quot;Close&quot;, instead of the normal &quot;Exit&quot; when only one buffer is open.</p></blockquote>
 <h2><a name="3.8"></a>3.8. Tell me more about this verbatim input stuff!</h2>
 <blockquote><p>To use verbatim input, you must be using nano 1.3.1 or newer. When you want to insert a literal character into the file you're editing, such as a control character that nano usually treats as a command, first press <b>Meta-V</b>. (If you're not at a prompt, you'll get the message &quot;Verbatim Input&quot;.) Then press the key(s) that generate the character you want.</p>
-  <p>Alternatively, you can press <b>Meta-V</b> and then type a four-digit hexadecimal code from 0000 to ffff (case-insensitive), and the character with the corresponding value will be inserted instead.</p></blockquote>
+  <p>Alternatively, you can press <b>Meta-V</b> and then type a six-digit hexadecimal code from 000000 to 10FFFF (case-insensitive), and the character with the corresponding value will be inserted instead.</p></blockquote>
 <h2><a name="3.9"></a>3.9. How do I make a .nanorc file that nano will read when I start it?</h2>
 <blockquote><p>It's not hard at all! But, your version of nano must have been compiled with <b>--enable-nanorc</b>, and again must be version 1.1.12 or newer (use nano -V to check your version and compiled features). Then simply copy the <b>nanorc.sample</b> that came with the nano source or your nano package (most likely in /usr/doc/nano) to .nanorc in your home directory. If you didn't get one, the syntax is simple. Flags are turned on and off by using the word <b>set</b> and the getopt_long flag for the feature, for example &quot;set nowrap&quot; or &quot;set suspend&quot;.</p></blockquote>
 <hr width="100%">
@@ -250,6 +250,7 @@
 <h2><a name="8"></a>8. ChangeLog</h2>
 <blockquote>
 <p>
+2005/08/08 - Update section 3.8 to mention that verbatim input mode now takes a six-digit hexadecimal number. (DLR)<br>
 2005/07/04 - Update section 4.10 to mention that pasting from the X clipboard via the middle mouse button also works when the Shift key is used.<br>
 2005/06/15 - Update description of --enable-extra, and add missing line breaks. (DLR)<br>
 2005/06/13 - Minor capitalization and wording fixes. (DLR)<br>
diff --git a/src/chars.c b/src/chars.c
index c7f61038..bb41f600 100644
--- a/src/chars.c
+++ b/src/chars.c
@@ -888,8 +888,8 @@ bool has_blank_mbchars(const char *s)
  * ranges D800-DFFF or FFFE-FFFF), and FALSE otherwise. */
 bool is_valid_unicode(wchar_t wc)
 {
-    return (0 <= wc && (wc <= 0xD7FF || 0xE000 <= wc) && (wc !=
-	0xFFFE && wc != 0xFFFF));
+    return (0 <= wc && (wc <= 0xD7FF || 0xE000 <= wc) && (wc <=
+	0xFFFD || (0x10000 <= wc && wc <= 0x10FFFF)));
 }
 #endif
 
diff --git a/src/winio.c b/src/winio.c
index 3ecde585..ade2f6ee 100644
--- a/src/winio.c
+++ b/src/winio.c
@@ -1232,9 +1232,9 @@ int get_byte_kbinput(int kbinput
     return retval;
 }
 
-/* Translate a Unicode sequence: turn a four-digit hexadecimal number
- * from 0000 to FFFF (case-insensitive) into its corresponding multibyte
- * value. */
+/* Translate a Unicode sequence: turn a six-digit hexadecimal number
+ * from 000000 to 10FFFF (case-insensitive) into its corresponding
+ * multibyte value. */
 long get_unicode_kbinput(int kbinput
 #ifndef NANO_SMALL
 	, bool reset
@@ -1253,15 +1253,41 @@ long get_unicode_kbinput(int kbinput
     }
 #endif
 
-    /* Increment the word digit counter. */
+    /* Increment the Unicode digit counter. */
     uni_digits++;
 
     switch (uni_digits) {
 	case 1:
 	    /* One digit: reset the Unicode sequence holder and add the
-	     * digit we got to the 0x1000's position of the Unicode
+	     * digit we got to the 0x100000's position of the Unicode
 	     * sequence holder. */
 	    uni = 0;
+	    if ('0' <= kbinput && kbinput <= '1')
+		uni += (kbinput - '0') * 0x100000;
+	    else
+		/* If the character we got isn't a hexadecimal digit, or
+		 * if it is and it would put the Unicode sequence out of
+		 * valid range, save it as the result. */
+		retval = kbinput;
+	    break;
+	case 2:
+	    /* Two digits: add the digit we got to the 0x10000's
+	     * position of the Unicode sequence holder. */
+	    if ('0' == kbinput || (uni < 0x100000 && '1' <= kbinput &&
+		kbinput <= '9'))
+		uni += (kbinput - '0') * 0x10000;
+	    else if (uni < 0x100000 && 'a' <= tolower(kbinput) &&
+		tolower(kbinput) <= 'f')
+		uni += (tolower(kbinput) + 10 - 'a') * 0x10000;
+	    else
+		/* If the character we got isn't a hexadecimal digit, or
+		 * if it is and it would put the Unicode sequence out of
+		 * valid range, save it as the result. */
+		retval = kbinput;
+	    break;
+	case 3:
+	    /* Three digits: add the digit we got to the 0x1000's
+	     * position of the Unicode sequence holder. */
 	    if ('0' <= kbinput && kbinput <= '9')
 		uni += (kbinput - '0') * 0x1000;
 	    else if ('a' <= tolower(kbinput) && tolower(kbinput) <= 'f')
@@ -1272,8 +1298,8 @@ long get_unicode_kbinput(int kbinput
 		 * valid range, save it as the result. */
 		retval = kbinput;
 	    break;
-	case 2:
-	    /* Two digits: add the digit we got to the 0x100's position
+	case 4:
+	    /* Four digits: add the digit we got to the 0x100's position
 	     * of the Unicode sequence holder. */
 	    if ('0' <= kbinput && kbinput <= '9')
 		uni += (kbinput - '0') * 0x100;
@@ -1285,8 +1311,8 @@ long get_unicode_kbinput(int kbinput
 		 * valid range, save it as the result. */
 		retval = kbinput;
 	    break;
-	case 3:
-	    /* Three digits: add the digit we got to the 0x10's position
+	case 5:
+	    /* Five digits: add the digit we got to the 0x10's position
 	     * of the Unicode sequence holder. */
 	    if ('0' <= kbinput && kbinput <= '9')
 		uni += (kbinput - '0') * 0x10;
@@ -1298,8 +1324,8 @@ long get_unicode_kbinput(int kbinput
 		 * valid range, save it as the result. */
 		retval = kbinput;
 	    break;
-	case 4:
-	    /* Four digits: add the digit we got to the 1's position of
+	case 6:
+	    /* Six digits: add the digit we got to the 1's position of
 	     * the Unicode sequence holder, and save the corresponding
 	     * Unicode value as the result. */
 	    if ('0' <= kbinput && kbinput <= '9') {
@@ -1316,14 +1342,14 @@ long get_unicode_kbinput(int kbinput
 		retval = kbinput;
 	    break;
 	default:
-	    /* More than four digits: save the character we got as the
+	    /* More than six digits: save the character we got as the
 	     * result. */
 	    retval = kbinput;
 	    break;
     }
 
-    /* If we have a result, reset the word digit counter and the word
-     * sequence holder. */
+    /* If we have a result, reset the Unicode digit counter and the
+     * Unicode sequence holder. */
     if (retval != ERR) {
 	uni_digits = 0;
 	uni = 0;