It is time to write a new version of my application to pull photos off a memory stick and organize them as I like them to be.  Our current version was a quick hack to get something working when we got our digital camera a couple of years ago.  It was hacked over and over, and it ended up being very tightly bound to a certain use case.  Our use case has now changed since we have a Windows Home Server.  It's time for a more robust solution that will survive future process changes.

I got to the ThumbnailCreator class, and I need to filter the list of files to process in two ways:

  1. Don't create thumbnails of photos that themselves are thumbnails
  2. Optionally don't overwrite existing thumbnails

I want to respect these 2 rules so that I can point the ThumbnailCreator at any folder, tell it to run, and get the results I want.  In the current version of my photo organization process, I have to keep moving the files to new folders to ensure that each sub-process has a clean working surface.  In the new version, I want the sub-processes to adapt to whatever is already in the folder without causing trouble and without having to jump from folder to folder.

I determined I would need a method with the following signature:

    private IEnumerable<FileInfo> GetPhotosToProcess(FileInfo[] photos)


I created the following test:

    public void GetPhotosToProcessTest()
    {
        FileInfo[] photos = new FileInfo[10]
        {
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-37-07.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-37-07.small.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-38-07.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-38-07.small.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-39-07.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-39-07.small.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-40-07.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-40-07.small.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-41-07.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-41-07.small.jpg")
        };

        ThumbnailCreator_Accessor target = new ThumbnailCreator_Accessor(new DirectoryInfo(@"C:\"), false, true);
        IEnumerable<FileInfo> actual = target.GetPhotosToProcess(photos);
        Assert.AreEqual(5, actual.Count(), "When Replace = true, the actual thumbnails should be removed, but all others should remain.");

        target = new ThumbnailCreator_Accessor(new DirectoryInfo(@"C:\"), false, false);
        actual = target.GetPhotosToProcess(photos);
        Assert.AreEqual(0, actual.Count(), "When Replace = false, files with existing thumbnails should be removed.");
    }


And this was the code I ended up testing:

    private IEnumerable<FileInfo> GetPhotosToProcess(FileInfo[] photos)
    {
        // Create a Regex based on the output filename format, where we will
        // look for files that match the format.  Therefore, we replace {0}
        // with .+, meaning one or more characters.  If a file matches
        // this Regex, then it is a thumbnail itself so we don't want to
        // create a thumbnail of it.
        Regex thumbPattern = new Regex(string.Format(this.OutputFilenameFormat, @".+"));

        // We will subtract the existing thumbnails from the photos list
        IEnumerable<FileInfo> photosToProcess = photos.Except(photos.Where(p => thumbPattern.IsMatch(p.Name)));

        // If we're not supposed to replace existing thumbnails, subtract the photos that
        // already have existing thumbnails.  Be sure to use our FileInfoComparer, otherwise
        // Contains() will always return false.
        if (!this.ReplaceExistingThumbnails)
        {
            FileInfoComparer comparer = new FileInfoComparer();
            photosToProcess = photosToProcess.Except(photosToProcess.Where(p => photos.Contains(ThumbnailFile(p), comparer)));
        }

        // Return the filtered list
        return photosToProcess;
    }
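
The FileInfoComparer used above isn't shown in this post. FileInfo doesn't override Equals, so two instances pointing at the same path are not equal by default, which is why Contains() needs a comparer. Here is a minimal sketch of what such a comparer might look like, assuming it compares full paths and ignores case (reasonable for Windows paths); the real class may well differ:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch of a FileInfoComparer: two FileInfo objects are
// treated as equal when their full paths match, ignoring case.
public sealed class FileInfoComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y)
    {
        if (ReferenceEquals(x, y)) return true;   // same instance, or both null
        if (x == null || y == null) return false; // exactly one is null
        return string.Equals(x.FullName, y.FullName, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(FileInfo obj)
    {
        // Must agree with Equals: paths that differ only by case hash the same.
        if (obj == null) return 0;
        return StringComparer.OrdinalIgnoreCase.GetHashCode(obj.FullName);
    }
}
```

With something like this in place, photos.Contains(someFile, comparer) finds a match whenever the array holds any FileInfo with the same path, even if it is a different object.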


At one point I actually combined the 2 steps into a single step, but the code was too hard to read.  What I have here feels really nice.  I don't even want to try to think up how I would have implemented these 2 rules without LINQ; it's no wonder the previous code just kept creating new folders for every sub-process.

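Something noteworthy, though: the Contains call runs against the original photos array rather than against photosToProcess.  There are two reasons.  First, photosToProcess has already had the thumbnails filtered out, so it could never contain the thumbnail I'm looking for.  Second, when I had photosToProcess.Contains, I got infinite recursion resulting in a stack overflow.  LINQ queries are evaluated lazily, and the lambda captures the photosToProcess variable itself; by the time the query actually runs, that variable refers to the very query being enumerated, so enumerating it calls Contains, which enumerates it again, and so on.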
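
One way to sidestep that kind of self-referencing query is to snapshot the sequence with ToList() before any lambda searches it. The sketch below is only an illustration: it uses plain file-name strings instead of FileInfo, and ThumbnailName is a hypothetical stand-in for ThumbnailFile() that assumes the ".small.jpg" naming seen in the test data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Hypothetical stand-in for ThumbnailFile(): maps "a.jpg" to "a.small.jpg".
    public static string ThumbnailName(string photo)
    {
        return Path.ChangeExtension(photo, ".small.jpg");
    }

    public static List<string> PhotosWithoutThumbnails(IEnumerable<string> files)
    {
        // ToList() runs the query immediately, so 'snapshot' is a fixed list.
        // Lambdas below search this list and never re-enter a pending query.
        List<string> snapshot = files.ToList();

        return snapshot
            // Rule 1: skip files that are thumbnails themselves.
            .Where(f => !f.EndsWith(".small.jpg", StringComparison.OrdinalIgnoreCase))
            // Rule 2: skip photos whose thumbnail is already in the list.
            .Where(f => !snapshot.Contains(ThumbnailName(f), StringComparer.OrdinalIgnoreCase))
            .ToList();
    }
}
```

Because the lambdas close over snapshot, which is never reassigned, nothing here can end up enumerating itself.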

 

Follow-up...

Okay, it turns out, the old code did respect these 2 rules as well, but in a slightly different fashion:

    foreach (FileInfo file in folder.GetFiles("*.jpg"))
    {
        if (!file.FullName.EndsWith(".small.jpg") && !System.IO.File.Exists(file.FullName.Replace(".jpg", ".small.jpg")))
            ShrinkImage(file, jpgInfo, encoderParams);
    }


While that did the trick, the old code was nowhere near testable and it was definitely hard-coded to be very specific.
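
For comparison, here is a rough sketch of how the old inline condition could have been made testable on its own, by pulling it into a function and passing the file-existence check in as a delegate. The names LegacyThumbnailRule and NeedsThumbnail are made up for illustration, and unlike the old code this version ignores case and swaps only the final extension:

```csharp
using System;
using System.IO;

public static class LegacyThumbnailRule
{
    // Hypothetical refactoring of the old inline condition. 'fileExists' is
    // injected so a test can supply a fake instead of touching the disk;
    // production code could pass File.Exists.
    public static bool NeedsThumbnail(string fullName, Func<string, bool> fileExists)
    {
        const string thumbSuffix = ".small.jpg";

        // The file is itself a thumbnail, so there is nothing to shrink.
        if (fullName.EndsWith(thumbSuffix, StringComparison.OrdinalIgnoreCase))
            return false;

        // Path.ChangeExtension replaces only the last extension, whereas
        // string.Replace would rewrite every ".jpg" occurring in the path.
        string thumbPath = Path.ChangeExtension(fullName, thumbSuffix);
        return !fileExists(thumbPath);
    }
}
```

A test can then call NeedsThumbnail("a.jpg", p => false) without any files on disk, which is exactly what the original loop made impossible.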