
I’m sure you can understand that it takes time to finish your first edit. This is my first full novel edit, and it is proving tedious and time-consuming. I am doing my best to move along a little quicker.

NaNoE Additions

I have hit a slight stumbling block since I started editing my first draft of Accidental Distances.

We should also pay attention to spelling. While going through the editing grind I kept seeing spelling flaws that were easy to fix. Catching them automatically saves time, so it makes sense to have a spelling check alongside the existing checks for grammar and useless words.

            // =========================================================
            // = Checks added                                          =
            // =========================================================

            // Spelling
            var splt = para.Split(' ');
            for (int i = 0; i < splt.Length; i++)
            {
                if (!SpellCheck(splt[i])) ans.Add("{" + i.ToString() + "} Spelling Error: " + splt[i]);
            }
...
        private static bool SpellCheck(string v)
        {
            throw new NotImplementedException();
        }

That is where I chose to put it, within NaNoEdit.cs. Next, we need a file with a list of English words. We need to do a small number of things with that list, and the code for it obviously lives in NaNoEdit.cs as well:

        /// <summary>
        /// The dictionary
        /// </summary>
        private static List<string> Words;

        /// <summary>
        /// Load the file, generate the array, remove the file itself from memory
        /// </summary>
        public static void Init()
        {
            if (Words != null) return;

            Words = new List<string>();
            using (FileStream fs = new FileStream("words.txt", FileMode.Open))
            {
                using (StreamReader sr = new StreamReader(fs))
                {
                    string line;
                    while ((line = sr.ReadLine()) != null)
                    {
                        Words.Add(line);
                    }
                }
            }
        }

        /// <summary>
        /// Checks if the word is within the dictionary
        /// </summary>
        /// <param name="v">Word to check</param>
        /// <returns>True/False</returns>
        private static bool SpellCheck(string v)
        {
            throw new NotImplementedException();
        }

As you can see, SpellCheck isn’t implemented yet. I figured it best to discuss the search first. The easiest way to find out whether a word is in the dictionary is a binary search. It is something that most people with a degree in Computer Science should be able to do, although for some reason a few of them struggle.

Note that a simple binary search only works because the list is alphabetical. I had a go at writing it and, unfortunately, only have it more or less working. It just isn’t quite there…

        /// <summary>
        /// Checks if the word is within the dictionary
        /// </summary>
        /// <param name="v">Word to check</param>
        /// <returns>True/False</returns>
        private static bool SpellCheck(string v)
        {
            // Initiate borders
            var current = v.ToUpper();
            int lower = 0;
            int upper = Words.Count;
            int position = upper / 2;
            int character = 0;
            DictionaryDirection dictionaryDirection = DictionaryDirection.WhatNext;

            // the check and search
            while (upper != lower)
            {
                var hereWeAre = Words[position].ToUpper();
                if (hereWeAre == v) return true;

                // First finding the starting characters
                if ((int)(current[character]) > (int)(hereWeAre[character]))
                {
                    dictionaryDirection = DictionaryDirection.GoUp;
                }
                else if ((int)(current[character]) < (int)(hereWeAre[character]))
                {
                    dictionaryDirection = DictionaryDirection.GoDown;
                }
                else // This is assuming "current starting character = hereWeAre starting character" characters
                {
                    // We stick within the limits
                    for (int i = 1; (i < current.Length) && (i < hereWeAre.Length) && (dictionaryDirection == DictionaryDirection.WhatNext); i++)
                    {
                        if ((int)(current[character + i]) > (int)(hereWeAre[character + i]))
                        {
                            dictionaryDirection = DictionaryDirection.GoUp;
                        }
                        else if ((int)(current[character + i]) < (int)(hereWeAre[character + i]))
                        {
                            dictionaryDirection = DictionaryDirection.GoDown;
                        }
                    }

                    if (dictionaryDirection == DictionaryDirection.WhatNext) return false; // This is also flawed, however we are doing the minimal design
                }

                // Go a direction
                switch (dictionaryDirection)
                {
                    case DictionaryDirection.GoUp:
                        lower = position;
                        position = (upper - lower) / 2 + lower;
                        dictionaryDirection = DictionaryDirection.WhatNext;
                        break;
                    case DictionaryDirection.GoDown:
                        upper = position;
                        position = (upper - lower) / 2 + lower;
                        dictionaryDirection = DictionaryDirection.WhatNext;
                        break;
                }

                if (upper - lower <= 1) return false;
            }

            // This can be flawed with the characters used in paragraphs, but still minimises errors
            return false;
        }

The idea is simple: work out where the middle of the list is and check the middle element. Starting with the first character of the two words, is ours greater than or less than the middle word’s? If greater, move to the upper half; if less, move to the lower half; if they’re the same, go through the remaining characters to decide between the upper and lower halves.
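For comparison, here is a minimal sketch of the same idea that leans on string.CompareOrdinal instead of comparing characters by hand. The class and method names (SpellCheckSketch, Contains) are made up for illustration and are not part of NaNoE, and it assumes the list has already been upper-cased and sorted by character code:

```csharp
using System;
using System.Collections.Generic;

public static class SpellCheckSketch
{
    // Hypothetical helper, not NaNoE code.
    // Assumes sortedUpperWords holds upper-cased words sorted with StringComparer.Ordinal.
    public static bool Contains(List<string> sortedUpperWords, string word)
    {
        string target = word.ToUpperInvariant();
        int lower = 0;                           // inclusive lower border
        int upper = sortedUpperWords.Count - 1;  // inclusive upper border

        while (lower <= upper)
        {
            int position = lower + (upper - lower) / 2;
            int cmp = string.CompareOrdinal(target, sortedUpperWords[position]);

            if (cmp == 0) return true;           // exact match
            if (cmp > 0) lower = position + 1;   // target sorts after the middle word: go up
            else upper = position - 1;           // target sorts before the middle word: go down
        }

        return false;                            // borders crossed without a match
    }
}
```

Because the middle word is excluded from the next range on every step, the borders always move and the loop cannot get stuck on words that share a prefix.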

This also brought to my attention that words.txt wasn’t actually in alphabetical order when compared character by character. So I tried using sortmylist.com to see if it made a difference.

Clearly it didn’t.
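One way to take the guesswork out of the ordering is to sort the list in code, with the same comparison the search uses. This is only a sketch: DictionarySortSketch and Normalise are hypothetical names, and it assumes we are happy to upper-case every word and drop duplicates:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionarySortSketch
{
    // Hypothetical helper, not NaNoE code.
    // Upper-cases each word, drops blanks and duplicates, then sorts by character code.
    public static List<string> Normalise(IEnumerable<string> rawWords)
    {
        return rawWords
            .Select(w => w.Trim().ToUpperInvariant())
            .Where(w => w.Length > 0)
            .Distinct()
            .OrderBy(w => w, StringComparer.Ordinal)
            .ToList();
    }
}
```

As far as I know, culture-aware sorting can give hyphens and apostrophes special treatment, so an ordinal sort is the safer match for a search that compares raw character values.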

The next step is to try that again and see what it does, step by step:

  • Breakpoint at the start of SpellCheck, start running debug, start editing with only one word: this.
  • v = “this”, current = “THIS”, lower = 0, upper = 466552, position = 233276, dictionaryDirection = WhatNext
    First char check: THIS versus INFALL => Go Up
  • v = “this”, hereWeAre = “reimmerge”, lower = 233276, upper = 466552, position = 349914
    First char check: THIS versus REIMMERGE => Go Up
  • v = “this”, hereWeAre = “tempest-tossed”, lower = 349914, upper = 466552, position = 408233
    First char check: THIS versus TEMPEST-TOSSED => Go Up
  • v = “this”, hereWeAre = “unket”, lower = 408233, upper = 466552, position = 437392
    First char check: THIS versus UNKET => Go Down
  • v = “this”, hereWeAre = “turbulidentate”, lower = 408233, upper = 437392, position = 422812
    First char check: THIS versus TURBULIDENTATE => Go Down
  • v = “this”, hereWeAre = “tolsel”, lower = 408233, upper = 422812, position = 415522
    First char check: THIS versus TOLSEL => Go Down
  • v = “this”, hereWeAre = “thirteen-ringed”, lower = 408233, upper = 415522, position = 411877
    First char check: THIS versus THIRTEEN-RINGED => Go Up
  • v = “this”, hereWeAre = “tide-generating”, lower = 411877, upper = 415522, position = 413699
    First char check: THIS versus TIDE-GENERATING => Go Down
  • v = “this”, hereWeAre = “thrilly”, lower = 411877, upper = 413699, position = 412788
    First char check: THIS versus THRILLY => Go Down
  • v = “this”, hereWeAre = “thrashing-floor”, lower = 411877, upper = 412788, position = 412332
    First char check: THIS versus THRASHING-FLOOR => Go Down
  • v = “this”, hereWeAre = “thornier”, lower = 411877, upper = 412332, position = 412104
    First char check: THIS versus THORNIER => Go Down
  • v = “this”, hereWeAre = “tholi”, lower = 411877, upper = 412104, position = 411990
    First char check: THIS versus THOLI => Go Down
  • Here it stops and goes straight to “edit this word, it isn’t correct”.
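The positions in that walk-through follow directly from the midpoint formula in SpellCheck, so they can be replayed without a dictionary at all. A small sketch, where MidpointSketch and Replay are names I made up for illustration, and 'U' and 'D' stand for Go Up and Go Down:

```csharp
using System;
using System.Collections.Generic;

public static class MidpointSketch
{
    // Same midpoint formula as in SpellCheck: (upper - lower) / 2 + lower.
    public static int Midpoint(int lower, int upper)
    {
        return (upper - lower) / 2 + lower;
    }

    // Hypothetical helper: replays a string of directions ('U' = Go Up, anything else = Go Down)
    // and returns every position visited, starting with the initial middle of the list.
    public static List<int> Replay(int count, string directions)
    {
        int lower = 0;
        int upper = count;
        int position = upper / 2;
        var visited = new List<int> { position };

        foreach (char direction in directions)
        {
            if (direction == 'U') lower = position;  // Go Up: raise the lower border
            else upper = position;                   // Go Down: drop the upper border

            position = Midpoint(lower, upper);
            visited.Add(position);
        }

        return visited;
    }
}
```

Calling MidpointSketch.Replay(466552, "UUUDDDUDDDD") walks the same eleven decisions as the steps above, starting at position 233276 and ending at 411990.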

Take note: there are still some 227 words left between lower and upper here. While stepping through I figured it possibly skips over upper and lower, so I should perhaps make it check position, upper, and also lower.

The next word happens to be this-worldian, and it points out the actual bug. The for loop goes through the 4 characters of “this” and compares them; if it still can’t decide after that, it needs to move to a new central word in the dictionary.

It became a simple adjustment, replacing the line (now commented out) with this check:

                    //if (dictionaryDirection == DictionaryDirection.WhatNext) return false; // This is also flawed, however we are doing the minimal design
                    if (dictionaryDirection == DictionaryDirection.WhatNext)
                    {   
                        if (current == hereWeAre)
                        {
                            return true;
                        }
                        position--;
                    }

It now ignores “this”, since it is spelled correctly.

Well now, I have thought of potential clashes here. We should also strip full stops, hyphens, and commas… and more than that, really. For now I just want to test without that next fix. This brought up yet another little problem: it gets into long loops when words start with the same characters. So instead of position-- I figure it would be best to simply make it go down.

dictionaryDirection = DictionaryDirection.GoDown;

That rests on a slight assumption: if the word is shorter, it will definitely be to the left. I believe the dictionary is sorted like that, so I will leave it here and class it as a success.

Perhaps not. We need to delve deeper.

It turns out that isn’t quite what I expected. Oh dear, time to step through and see where it picks the wrong direction. The central position goes through these words:
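That assumption does hold for an ordinal comparison: when one word is a strict prefix of another, the shorter one sorts first. A tiny sketch to check it, where PrefixOrderSketch and PrefixSortsFirst are hypothetical names and the list is assumed to be sorted by character code:

```csharp
using System;

public static class PrefixOrderSketch
{
    // Hypothetical helper, not NaNoE code.
    // True when `shorter` is a strict prefix of `longer` and sorts before it ordinally.
    public static bool PrefixSortsFirst(string shorter, string longer)
    {
        return longer.Length > shorter.Length
            && longer.StartsWith(shorter, StringComparison.Ordinal)
            && string.CompareOrdinal(shorter, longer) < 0;
    }
}
```

So with “THIS” against “THIS-WORLDIAN”, going down is the right direction, provided the file really is in that order.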

infall
bossdoms
edelweiss
coproduce
cheek
camel-shaped
catchpoll
carboxylation
cantatrici
cancha
cample
canal-bone
can-boxing
campus's
camshaft
camuning
camwood
can't
can's
FALSE

Well, it stops at can’s, so obviously that is where I have to step through. Looking through words.txt, it turns out the word can is missing for some reason. Oh well, I will source a replacement dictionary. Hunting around, I found one shared in answers to similar questions from others.

I swapped the contents of words.txt with Unix.dict. It is quite amusing that, as in the past, I had to resort to Unix sources for a project. One day I will move back to using it, as I did in the past; for now I’m just sticking within my hard drive sizes.

We can count that as a win!

There we go, commit, that should be fine. Now for the tiny bug fix we would love to have in this. There is a fine line in what it should pay attention to, so we will just stick to removing ‘.’, ‘;’, ‘:’ and ‘,’. Keep it simple. Scratch that: just as I’m implementing it, I have decided to take all punctuation out with Linq.

            // Spelling
            var splt = para.Split(' ');
            for (int i = 0; i < splt.Length; i++)
            {
                // Needs "using System.Linq;" at the top of the file; a string can be filtered directly
                var whichUsed = new string(splt[i].Where(c => !char.IsPunctuation(c)).ToArray());
                if (!SpellCheck(whichUsed)) ans.Add("{" + i.ToString() + "} Spelling Error: " + splt[i]);
            }

I’ll be honest, that goes too far. I will swap it to strip just the punctuation at the end of each word instead. The Spelling code gets simpler than above:

            // Spelling
            var splt = para.Split(' ');
            for (int i = 0; i < splt.Length; i++)
            {
                //var whichUsed = new string(splt[i].ToCharArray(0,splt[i].Length).Where(c => !char.IsSeparator(c)).ToArray());
                var whichUsed = splt[i];
                while (whichUsed.EndsWith(".") || whichUsed.EndsWith(",") || whichUsed.EndsWith(";") || whichUsed.EndsWith(":") || whichUsed.EndsWith(" "))
                {
                    whichUsed = whichUsed.Remove(whichUsed.Length - 1);
                }
                if (!SpellCheck(whichUsed)) ans.Add("{" + i.ToString() + "} Spelling Error: " + splt[i]);
            }
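As an aside, the while loop above can probably be collapsed into a single call to string.TrimEnd, which removes every trailing occurrence of the characters it is given. A minimal sketch, where TrimSketch and StripTrailing are made-up names and the character set is the same one the loop checks for:

```csharp
using System;

public static class TrimSketch
{
    // Same characters the while loop strips from the end of each word.
    private static readonly char[] Trailing = { '.', ',', ';', ':', ' ' };

    // Hypothetical helper, not NaNoE code.
    // Removes trailing punctuation but leaves anything inside the word alone.
    public static string StripTrailing(string token)
    {
        return token.TrimEnd(Trailing);
    }
}
```

Punctuation inside a word, such as the apostrophe in can't or the hyphen in this-worldian, is left untouched, which is the behaviour I was after.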

It isn’t completely there, but I feel like calling it quits for today.

The Chapter Grind

It definitely comes at a slow pace: reading your own novel for the first time, you regularly find flaws. The easiest way I’ve found is to put it into my OneDrive. From there I can put it in a format that fits perfectly on my mobile screen. On my phone, I use Microsoft Word and can easily leave comments for myself.

As you can see, the way it is all set up I can go through all the comments on my computer. The only thing delaying this is that I am reading slowly and carefully. Once I have all these comments I will move on to finalizing the first draft. I can currently see flaws in my ideas for the writing, and I would prefer to have those fixed before I do a read where I try not to touch anything.

The other area where I saw tons of flaws was spelling, which is why we have now added it to NaNoE. You can no doubt understand that it brought some stumbling delays to my first draft, but it is something I believe I will need for anything I write using NaNoE in the future.
