File copy over network

vt92 · ‎08-18-2009

I would like some suggestions on how to improve a program that I'm working on. At first glance, the task seems simple: Monitor a folder on a network drive and copy any new file as it appears.

Challenges:
1.) The scan folder has ~2000 files.
2.) Each file gets written in segments. I don't want the file until the write operation is complete. The time between the first write and completion can be as much as 20 minutes.

Here is the flow of my program now:

1.) Initialize, get file paths, etc.
2.) Read all the files in the folder into an array.
3.) Parse out the new files. Place these files in a queue, which is processed (copy operation) in a separate loop. The copy function is set up to overwrite the file if it already exists.
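For readers who think in text code, the flow above can be sketched roughly as follows. This is only an illustration in Python, not the actual LabVIEW program; the function names `enqueue_new_files` and `copy_queued_files` are made up for the example, and "new" is assumed to mean "a file name not seen on a previous scan".

```python
import os
import queue
import shutil


def enqueue_new_files(scan_dir, seen, copy_queue):
    """Queue every file in scan_dir whose name is not yet in `seen`.

    `seen` is a set of file names from earlier scans (an assumption
    about how "new" is decided). Returns the sorted new names.
    """
    new_names = []
    with os.scandir(scan_dir) as entries:
        for entry in entries:
            if entry.is_file() and entry.name not in seen:
                seen.add(entry.name)
                copy_queue.put(entry.path)
                new_names.append(entry.name)
    return sorted(new_names)


def copy_queued_files(copy_queue, dest_dir):
    """Copy everything currently queued; existing files are overwritten."""
    copied = []
    while True:
        try:
            src = copy_queue.get_nowait()
        except queue.Empty:
            break
        # shutil.copy2 returns the path of the file it created.
        copied.append(shutil.copy2(src, dest_dir))
    return copied
```

In a real program the two functions would run in separate loops, with the scan loop calling `enqueue_new_files` periodically and the copy loop draining the queue.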

Is there a better (faster) way to get a list of new files?

What is the best way to make sure the file is complete before copying it? The file contents vary, so there is no good "flag" to look for in the file. Currently, I am just overwriting all files modified during the last hour, but somehow I am still getting incomplete files.

"There is a God shaped vacuum in the heart of every man which cannot be filled by any created thing, but only by God, the Creator, made known through Jesus." - Blaise Pascal

Jarrod_S. · ‎08-18-2009

I don't know of a good way to tell when a write operation has been completed on a file. But there's a very good workaround that completely gets around this problem. Don't write your active files in the scan folder. Instead, write the files in some temporary location. When the write operation is through, close the file and copy it to the scan folder. This means that there is never an incomplete file in the scan folder, and any new additions are ready to be copied over the network.
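A minimal sketch of that idea, in Python purely for illustration: the file is written in a staging folder and only placed in the scan folder once it has been closed. The name `write_then_publish` and the folder arguments are hypothetical. Note that the sketch uses a move rather than a copy for the last step; when both folders are on the same volume a move is normally just a rename, so the file shows up in the scan folder all at once.

```python
import os
import shutil


def write_then_publish(chunks, staging_dir, scan_dir, name):
    """Write `chunks` to a file in staging_dir, then move it to scan_dir.

    The file only appears in scan_dir after it has been fully written
    and closed, so a watcher on scan_dir never sees a partial file.
    """
    staging_path = os.path.join(staging_dir, name)
    with open(staging_path, "wb") as f:
        for chunk in chunks:  # segments may arrive over a long period
            f.write(chunk)
    # The file is closed here; now hand it over to the scan folder.
    final_path = os.path.join(scan_dir, name)
    shutil.move(staging_path, final_path)
    return final_path
```

This only helps if you control the program that writes the files.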

The scanning challenge is something that I would let the OS do for you, rather than polling the file list yourself. There is a .NET Event you can register on a folder to get updates (new files added, etc.) without wasting CPU cycles. I can't remember the .NET library to use, but if I find it with a quick Google search, I'll let you know.

Jarrod S.
National Instruments

Jarrod_S. · ‎08-18-2009

The .NET class is called FileSystemWatcher, from System.IO.

Jarrod S.
National Instruments

vt92 · ‎08-19-2009

Jarrod,

I don't have control over the system that writes the files.

I'll take a look at the .NET class.

smercurio_fc · ‎08-19-2009

I had posted an example of using the FileSystemWatcher class here.

LabBEAN · ‎04-23-2014

Another way to detect that a file is ready to move/copy: after writing is complete, set the read-only flag. After transferring the file, un-set read-only on the copy, which tells you the move/copy is complete on the other side.

-Jason

VT '00

Certified LabVIEW Architect
TestScript: Free Python/LabVIEW Connector

One global to rule them all,
One double-click to find them,
One interface to bring them all
and in the panel bind them.

RavensFan · ‎04-23-2014

Setting the read only flag sounds like a good idea.

But actually, the archive flag is the setting that was historically meant to be used for this purpose. Whether it still is or not, I don't know. Do OSes still set and clear this flag on file copies? It was meant as a way for backup programs to recognize whether a file is new or has previously been backed up.

LabBEAN · ‎04-23-2014

The read-only flag method is actually surprisingly simple and effective in the field:

Read-only flag:

- not generally used by the OS, processes, or applications
- set FALSE by default (upon file creation)
- when TRUE, signifies "no writing" (or in our case, write complete)

Archive flag:

- may or may not be used by backup applications
- traditionally set TRUE by default (upon file creation) or when a file change occurs
- when TRUE, signifies "needs backup"

Using Archive would introduce risk, since we'd be looking for a TRUE flag to signify "start copy", and the flag is TRUE upon file creation and can be manipulated by the OS (e.g. if the file is written to) and by backup applications.

Consideration should be given to whether one wants to keep a local copy. To prevent bloat in the scan directory, identify another local archive directory or series of directories.

If deleting local copy:

After write or copy to scan directory is complete, set Read-only. Scan for Read-only files. Copy to network. When copy is complete, un-set remote Read-only. Delete local copy.

If keeping the local copy:

After write or copy to scan directory is complete, set Read-only. Scan for Read-only files. Copy to network. When copy is complete, un-set remote Read-only. Copy to local archive directory. When copy is complete, un-set local archive Read-only. Delete original local copy from the scan directory.
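The "deleting local copy" sequence could look roughly like this in Python. It is a sketch under assumptions, not a tested field implementation: the helper names are invented for the example, `os.chmod` with `stat.S_IREAD` is used to set the read-only attribute (on Windows this maps to the read-only flag), and the writer is assumed to set the flag itself once it has finished.

```python
import os
import shutil
import stat


def set_read_only(path):
    """Writer side: mark a finished file as ready for transfer."""
    os.chmod(path, stat.S_IREAD)


def clear_read_only(path):
    os.chmod(path, stat.S_IREAD | stat.S_IWRITE)


def is_read_only(path):
    return not (os.stat(path).st_mode & stat.S_IWRITE)


def transfer_ready_files(scan_dir, remote_dir):
    """Copy read-only files to remote_dir, then delete the local copies."""
    with os.scandir(scan_dir) as entries:
        ready = [e for e in entries if e.is_file() and is_read_only(e.path)]
    moved = []
    for entry in ready:
        remote_path = os.path.join(remote_dir, entry.name)
        # copy2 also copies the permission bits, so the remote file
        # stays read-only until the copy has finished.
        shutil.copy2(entry.path, remote_path)
        clear_read_only(remote_path)  # signals "copy complete" remotely
        clear_read_only(entry.path)   # needed on Windows before deleting
        os.remove(entry.path)
        moved.append(entry.name)
    return sorted(moved)
```

For the "keeping the local copy" variant, the same copy-then-clear step would be repeated for the local archive directory before deleting the original from the scan directory.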
