It is time to write a new version of my application that pulls photos off a memory stick and organizes them the way I like. The current version was a quick hack to get something working when we got our digital camera a couple of years ago. It was patched over and over, and it ended up tightly bound to one particular use case. That use case has changed now that we have a Windows Home Server, so it's time for a more robust solution that will survive future process changes.
I got to the ThumbnailCreator class, and I need to filter the list of files to process in two ways:
- Don't create thumbnails of photos that themselves are thumbnails
- Optionally don't overwrite existing thumbnails
I want to respect these two rules so that I can point the ThumbnailCreator at any folder, tell it to run, and get the results I want. In the current version of my photo organization process, I have to keep moving the files to new folders to ensure that the sub-processes have clean working surfaces. In the new version, I want the sub-processes to adapt to their environment without causing trouble and without having to jump from folder to folder.
I determined I would need a method with the following signature:
    private IEnumerable<FileInfo> GetPhotosToProcess(FileInfo[] photos)
I created the following test:
    public void GetPhotosToProcessTest()
    {
        FileInfo[] photos = new FileInfo[10]
        {
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-37-07.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-37-07.small.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-38-07.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-38-07.small.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-39-07.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-39-07.small.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-40-07.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-40-07.small.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-41-07.jpg"),
            new FileInfo(@"E:\2007\2007-03\2007-03-07\2007-03-07_20-41-07.small.jpg")
        };

        ThumbnailCreator_Accessor target = new ThumbnailCreator_Accessor(new DirectoryInfo(@"C:\"), false, true);
        IEnumerable<FileInfo> actual = target.GetPhotosToProcess(photos);
        Assert.AreEqual(5, actual.Count(), "When Replace = true, the actual thumbnails should be removed, but all others should remain.");

        target = new ThumbnailCreator_Accessor(new DirectoryInfo(@"C:\"), false, false);
        actual = target.GetPhotosToProcess(photos);
        Assert.AreEqual(0, actual.Count(), "When Replace = false, files with existing thumbnails should be removed.");
    }
And this was the code I ended up testing:
    private IEnumerable<FileInfo> GetPhotosToProcess(FileInfo[] photos)
    {
        // Create a Regex based on the output filename format, where we will
        // look for files that match the format. Therefore, we replace {0}
        // with .+, meaning one or more characters. If a file matches
        // this Regex, then it is a thumbnail itself so we don't want to
        // create a thumbnail of it.
        Regex thumbPattern = new Regex(string.Format(this.OutputFilenameFormat, @".+"));

        // We will subtract the existing thumbnails from the photos list
        IEnumerable<FileInfo> photosToProcess = photos.Except(photos.Where(p => thumbPattern.IsMatch(p.Name)));

        // If we're not supposed to replace existing thumbnails, subtract the photos that
        // already have existing thumbnails. Be sure to use our FileInfoComparer, otherwise
        // Contains() will always return false, because FileInfo instances are compared
        // by reference by default.
        if (!this.ReplaceExistingThumbnails)
        {
            FileInfoComparer comparer = new FileInfoComparer();
            photosToProcess = photosToProcess.Except(photosToProcess.Where(p => photos.Contains(ThumbnailFile(p), comparer)));
        }

        // Return the filtered list
        return photosToProcess;
    }
At one point I actually combined the two steps into a single statement, but the code was too hard to read. What I have here feels really nice. I don't even want to try to think up how I would have implemented these two rules without LINQ; it's no wonder the previous code just kept creating new folders for every sub-process.
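The method leans on two pieces I haven't shown: FileInfoComparer and ThumbnailFile. The following is only a minimal sketch of what they could look like, not the actual classes from my project. It assumes that two files count as equal when their full paths match case-insensitively (Windows-style paths) and that the output format is "{0}.small.jpg", as the file names in the test suggest. The ThumbnailNaming class and its outputFilenameFormat parameter are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical sketch of the comparer: FileInfo does not override Equals,
// so without a comparer two instances for the same path are never "equal".
public sealed class FileInfoComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;

        // Assumption: paths are compared case-insensitively, as on Windows.
        return string.Equals(x.FullName, y.FullName, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(FileInfo obj)
    {
        // Must agree with Equals: hash the path with the same comparison rules.
        return obj == null ? 0 : StringComparer.OrdinalIgnoreCase.GetHashCode(obj.FullName);
    }
}

public static class ThumbnailNaming
{
    // Hypothetical stand-in for ThumbnailFile(p): builds the thumbnail's
    // FileInfo in the same folder as the photo, e.g. "a.jpg" -> "a.small.jpg"
    // when outputFilenameFormat is "{0}.small.jpg".
    public static FileInfo ThumbnailFile(FileInfo photo, string outputFilenameFormat)
    {
        string baseName = Path.GetFileNameWithoutExtension(photo.Name);
        string thumbName = string.Format(outputFilenameFormat, baseName);
        return new FileInfo(Path.Combine(photo.DirectoryName, thumbName));
    }
}
```

With something like this in place, photos.Contains(thumb, new FileInfoComparer()) finds a thumbnail by path, while a plain photos.Contains(thumb) would only succeed for the very same object instance.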
Something noteworthy, though: the Contains call executes against the original photos array rather than against photosToProcess, which is already filtered. When I had photosToProcess.Contains, I got infinite recursion that ended in a stack overflow.
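My best explanation for the stack overflow is LINQ's deferred execution combined with how lambdas capture variables: the lambda captures the photosToProcess variable itself, and by the time anything is enumerated, that variable already points at the new Except query, so Contains re-enters the query it is part of. The sketch below reproduces the shape of the problem with plain integers. DeferredExecutionDemo and its method names are made up for illustration, and the "n + 2" rule is just a stand-in for "this photo already has a thumbnail".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Safe variant: snapshot the intermediate result before filtering again.
    public static List<int> FilterSafely(int[] source)
    {
        IEnumerable<int> query = source.Where(n => n % 2 == 0);

        // Materialize first. The lambda below closes over 'snapshot',
        // a finished list, rather than over the 'query' variable.
        List<int> snapshot = query.ToList();
        query = query.Except(query.Where(n => snapshot.Contains(n + 2)));

        return query.ToList();
    }

    // Shown for contrast only: do NOT enumerate the returned sequence.
    public static IEnumerable<int> BuildSelfReferencingQuery(int[] source)
    {
        IEnumerable<int> query = source.Where(n => n % 2 == 0);

        // The lambda captures the *variable* 'query'. When the result is
        // finally enumerated, 'query' refers to the Except(...) sequence
        // built on this line, so Contains enumerates that same sequence
        // again, recursively, until the stack runs out.
        query = query.Except(query.Where(n => query.Contains(n + 2)));
        return query;
    }
}
```

Checking against the original array, as my method does, avoids the problem because the array is not a deferred query. Calling ToList() on the intermediate result, as in FilterSafely, is another way out if the second rule ever needs to look at the filtered list.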
Follow-up...
Okay, it turns out, the old code did respect these 2 rules as well, but in a slightly different fashion:
    foreach (FileInfo file in folder.GetFiles("*.jpg"))
    {
        if (!file.FullName.EndsWith(".small.jpg") && !System.IO.File.Exists(file.FullName.Replace(".jpg", ".small.jpg")))
            ShrinkImage(file, jpgInfo, encoderParams);
    }
While that did the trick, the old code was nowhere near testable, and it was hard-coded to one specific naming scheme.
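For comparison, here is a rough sketch of how the old check could have been pulled out into something testable. This is not code from either version of the application: LegacyThumbnailRule, NeedsThumbnail, and the fileExists delegate are hypothetical names, and the case-insensitive suffix check is my assumption. The delegate stands in for System.IO.File.Exists so a test never has to touch the disk.

```csharp
using System;
using System.IO;

public static class LegacyThumbnailRule
{
    // Hypothetical helper: returns true when 'file' is a regular photo
    // that does not have a ".small.jpg" thumbnail next to it yet.
    public static bool NeedsThumbnail(FileInfo file, Func<string, bool> fileExists)
    {
        const string thumbSuffix = ".small.jpg";

        // Rule 1: a thumbnail never gets a thumbnail of its own.
        if (file.Name.EndsWith(thumbSuffix, StringComparison.OrdinalIgnoreCase))
            return false;

        // Rule 2: skip photos whose thumbnail already exists.
        // Path.ChangeExtension only rewrites the final extension, whereas
        // string.Replace(".jpg", ...) would also rewrite ".jpg" appearing
        // inside a folder name.
        string thumbPath = Path.ChangeExtension(file.FullName, thumbSuffix);
        return !fileExists(thumbPath);
    }
}
```

In production the call would be NeedsThumbnail(file, File.Exists), while a test can pass a lambda such as path => false to pretend no thumbnails exist.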