1. Kari Paukku
  2. PowerBuilder
  3. Saturday, 25 February 2023 08:53 AM UTC

Hi,

What would be the best way to strip out all the illegal characters that Windows does not allow in file names?

Characters like the question mark (?) or the colon (:).

We create files (.msg) from the emails we receive. One option is to use self-generated file names, like FILE01.MSG, and that works.

But is there a way (e.g. regex) to strip out the illegal characters from, say, the Subject field? That would make the file name more informative.

Thanks.

kp

Accepted Answer
John Fauss
  1. Sunday, 26 February 2023 18:18 PM UTC
  2. PowerBuilder

Hi, Kari - 

There is a Windows API function that will do this for you, called PathCleanupSpec. Here is the PB external function declaration:

FUNCTION Long PathCleanupSpec ( &
   String     pszDir, &
   REF String pszSpec &
   ) LIBRARY "Shell32.dll"

The first argument contains the drive and directory path for the file. The second argument contains the file's filename and extension (the file specification), and may be modified by the API function if it contains any invalid characters, which is why it must be passed by reference.

If everything's OK, the return code is zero; otherwise, it is a combination of one or more of the following flags:

Constant Long PCS_REPLACEDCHAR = 1
Constant Long PCS_REMOVEDCHAR  = 2
Constant Long PCS_TRUNCATED    = 4
Constant Long PCS_PATHTOOLONG  = 8
Constant Long PCS_FATAL        = -2147483648 // 0x80000000; 2147483648 does not fit in a signed 32-bit Long
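
Because the return code is a bit mask, several flags can be set at once. A minimal sketch of decoding it might look like the following. It assumes the result is held in a signed 32-bit Long (so the PCS_FATAL bit shows up as a negative value) and uses only Mod and Int arithmetic rather than any bitwise helper. The value assigned to ll_rc is a sample, not an actual result.

```powerscript
Long   ll_rc
String ls_flags

ll_rc = 6   // sample value only: PCS_REMOVEDCHAR (2) + PCS_TRUNCATED (4)

IF ll_rc = 0 THEN
   ls_flags = "none (the file specification was already valid)"
ELSEIF ll_rc < 0 THEN
   // High bit (0x80000000) set: a signed Long sees a negative number
   ls_flags = "PCS_FATAL"
ELSE
   // Test each low-order bit in turn: shift right by dividing, then check parity
   IF Mod(ll_rc, 2) = 1 THEN ls_flags += "PCS_REPLACEDCHAR "
   IF Mod(Int(ll_rc / 2), 2) = 1 THEN ls_flags += "PCS_REMOVEDCHAR "
   IF Mod(Int(ll_rc / 4), 2) = 1 THEN ls_flags += "PCS_TRUNCATED "
   IF Mod(Int(ll_rc / 8), 2) = 1 THEN ls_flags += "PCS_PATHTOOLONG "
END IF

MessageBox("PathCleanupSpec flags", ls_flags)
```

The negative case is reported on its own here to keep the arithmetic simple; the Microsoft documentation notes that PCS_FATAL is returned together with PCS_PATHTOOLONG.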

Here is an example of how to call this WinAPI function from PB:

Long   ll_rc
String ls_drive_and_dir, ls_orig_filespec, ls_filespec

ls_drive_and_dir = "C:\Windows\"
ls_orig_filespec = "*This <file specification> contains **INVALID** characters???.t|x|t"
ls_filespec      = ls_orig_filespec

ll_rc = PathCleanupSpec(ls_drive_and_dir,ls_filespec) // Issues RC=2

MessageBox("Path Cleanup Spec","RC = " + String(ll_rc) + &
   "~r~n~r~nDir:~t"    + ls_drive_and_dir + &
   "~r~n~r~nBefore:~t" + ls_orig_filespec + &
   "~r~n~r~nAfter:~t"  + ls_filespec)

ls_drive_and_dir = "C:\"
ls_orig_filespec = "This file specification is valid!.pdf"
ls_filespec      = ls_orig_filespec

ll_rc = PathCleanupSpec(ls_drive_and_dir,ls_filespec) // Issues RC=0

MessageBox("Path Cleanup Spec","RC = " + String(ll_rc) + &
   "~r~n~r~nDir:~t"    + ls_drive_and_dir + &
   "~r~n~r~nBefore:~t" + ls_orig_filespec + &
   "~r~n~r~nAfter:~t"  + ls_filespec)
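
For the original use case (naming a .msg file after the email's Subject), the call could be wrapped roughly as sketched below. This is only an illustration: the function name of_subject_to_filename, its argument as_subject, and the target folder "C:\Mail\" are hypothetical, and it assumes the PathCleanupSpec external function declared above is available on the same object.

```powerscript
// Hypothetical object function: of_subject_to_filename (String as_subject) returns String
Long   ll_rc
String ls_dir, ls_spec

ls_dir  = "C:\Mail\"                 // assumed target folder
ls_spec = Trim(as_subject) + ".msg"  // candidate file name built from the Subject

// The API rewrites ls_spec in place if it contains invalid characters
ll_rc = PathCleanupSpec(ls_dir, ls_spec)

// Fall back to a generated name if the cleanup failed (negative = PCS_FATAL
// when held in a signed Long) or nothing usable is left
IF ll_rc < 0 OR Len(Trim(ls_spec)) = 0 THEN
   ls_spec = "FILE01.MSG"
END IF

RETURN ls_spec
```

The fallback simply reuses the self-generated naming scheme mentioned in the question; a real implementation would also need to make sure the resulting name is unique within the folder.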

The following URL is the documentation for this Windows API function:

   https://learn.microsoft.com/en-us/windows/win32/api/shlobj_core/nf-shlobj_core-pathcleanupspec

The documentation includes a list of the characters that are invalid in a file specification.

Best regards, John

Comment
  1. Miguel Leeuwe
  2. Sunday, 26 February 2023 19:30 PM UTC
Very useful!
  1. Benjamin Gaesslein
  2. Monday, 27 February 2023 14:07 PM UTC
I have a function that just strips any character that isn't in the alphanumerical range (enough for my use case) but this is the proper way to do it.
Chris Pollach @Appeon
  1. Saturday, 25 February 2023 17:06 PM UTC
  2. PowerBuilder
  3. # 1

Hi Kari;

  I would write a routine that loops through the file name character by character, checking each one for compliance. If a character is not in compliance, drop it. The same applies to the file name suffix.
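
A loop of that kind could be sketched as follows. The function name f_strip_invalid_chars and its argument as_name are hypothetical, and the list of rejected characters is the set Windows reserves in file names (\ / : * ? " < > |) plus control characters; it is not taken from this reply.

```powerscript
// Hypothetical global function: f_strip_invalid_chars (String as_name) returns String
String ls_invalid, ls_char, ls_result
Long   ll_pos, ll_len

// Characters reserved in Windows file names; ~" is an escaped double quote
ls_invalid = "\/:*?~"<>|"

ll_len = Len(as_name)
FOR ll_pos = 1 TO ll_len
   ls_char = Mid(as_name, ll_pos, 1)
   // Keep the character unless it is reserved or a control character (code below 32)
   IF Pos(ls_invalid, ls_char) = 0 AND Asc(ls_char) >= 32 THEN
      ls_result += ls_char
   END IF
NEXT

RETURN ls_result
```

This sketch only filters individual characters. It does not handle reserved device names such as CON or NUL, trailing dots or spaces, or the overall path length.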

Regards... Chris 
