| Tips & Tricks - Index / Dcount |
This combination of statements will allow you to search a multivalued field for a match of a partial string.
Let's say that you have a multi-valued field of telephone numbers in a customer record, and as has happened so frequently in the last few years, new area codes have been assigned to specific telephone exchanges. In order to update and change the area code, the INDEX/DCOUNT combo might be useful.
For this example, we will need to find telephone numbers with an area code of 708 and an exchange of 351 and then change the area code to 630.
|
| Atb |
Data |
|
| 1 |
Bakken Software |
Company Name |
| 2 |
358 Army Trail Road #140 - 181 |
Street Address |
| 3 |
Bloomingdale |
City |
| 4 |
IL |
State |
| 5 |
60108 |
Zipcode |
| 6 |
800/742-5911]708/351-1052]708/351-1802 |
Telephone |
| |
(multivalue mark identified in red above) |
|
| |
|
Atb. 6 contains three telephone numbers in a multivalued field, as listed below |
| |
|
800/742-5911 |
| |
|
708/351-1052 |
| |
|
708/351-1802 |
| |
|
|
|
|
One method of identifying the area codes that need to be updated might be to:
1) Determine how many values exist within attribute 6 2) Loop through each of the values, parsing apart the telephone number 3) Checking each area code to see if the area code is to be changed
This would definitely work, but a more efficient method would be to use the INDEX/DCOUNT combo.
|
| POS = INDEX(CUST.REC<6>,"708/351",1) |
The INDEX function will search the string of |
| |
800/742-5911]708/351-1052]708/351-1802 for the 1st occurence of |
| |
the substring or partial value of 708/351 |
| |
Format of INDEX: |
| |
var = INDEX(STRINGTOSEARCH , SUBSTRING , OCCURRENCE) |
| |
In our example to the left, POS will be set to 14 as the substring |
| |
708/351 starts at the 14th character in the string |
| |
800/742-5911]708/351-1052]708/351-1802 |
| |
14th character |
| |
|
| MV.POS = DCOUNT(CUST.REC<6>[1,POS],CHAR(253)) |
The DCOUNT function will count the number of values separated |
| |
by the specified separator (multi-value mark ASCII CHAR 253) |
| |
starting with the 1st through the 14th character of the string |
| |
The result will be stored in the variable MV.POS |
| |
800/742-5911]708/351 |
| |
2 values separated by a multi-value mark |
| |
MV.POS will equal 2 |
| |
|
|
We now know that the 1st occurrence of the area code to be changed exists as the 2nd multi-valued field in attribute 6 of the customer record, and would need to now make the necessary change.
As this is a multi-valued field and more than one value might fit the condition under which the area code will need to be changed, we most likely would include this code within some type of loop in order to insure that all the telephone numbers needing to be updated, are indeed updated.
The full segment of code might look like the following:
|
| LOOP |
Begin looping through data to find all occurrences |
| POS = INDEX(CUST.REC<6>,"708/351",1) |
Find the character position of the substring if it exists |
| MV.POS = DCOUNT(CUST.REC<6>[1,POS],CHAR(253)) |
Find the multivalue position of the substring within the attribute |
| UNTIL MV.POS = 0 DO |
As long as the substring is found, do the following |
| PHONE = CUST.REC<6,MV.POS> |
Extract the full telephone number to be changed from the atb. |
| SUFFIX = OCONV(PHONE,"G1-1") |
Parse apart the phone# to obtain the suffix |
| CUST.REC<6,MV.POS> = "630/351-":SUFFIX |
Replace data existing telephone# with new telephone# |
| REPEAT |
Repeat process - check if another occurrence exists |
| |
|
| |
This example is shown using a dynamic array for the customer |
| |
information. Obviously a READU statement should be used prior |
| |
to this segment of code in order to lock the record while updating, |
| |
and a WRITE statement needs to follow the segment of code in order |
| |
to update the customer file with the updated telephone numbers |
| |
|
|
Here's another example of how this combination of statements might come in handy. Remember, this is used for searching a multivalued field for a match of a partial string.
In our customer file we might have a multi-valued field of open invoice information contained within the customer record, and to make things a bit more complicated, not only is the invoice# within a multi-valued field, it is also concatenated to the invoice date in which is in internal format.
(and yes, there probably is a better structure for this data, but this is from an actual client's site for which BSS was responsible for the maintenance of existing software and had come on board long after the original software was written.)
|
| Atb |
Data |
|
| 1 |
Bakken Software |
Company Name |
| 2 |
358 Army Trail Road #140 - 181 |
Street Address |
| 3 |
Bloomingdale |
City |
| 4 |
IL |
State |
| 5 |
60108 |
Zipcode |
| 6 |
630-351-1052 |
Telephone |
| 7 |
14701*100120]14722*100227]14750*100331 |
Invoice Date * Invoice# - bracket in red denotes the mv mark |
| |
|
Atb. 7 contains three multivalued fields, as listed below |
| |
|
14701*100120 (invoice date 03/31/08 Invoice# 100120) |
| |
|
14722*100227 (invoice date 04/21/08 Invoice# 100227) |
| |
|
14750*100331 (invoice date 05/19/08 invoice# 100331) |
|
|
If we needed to search for a particular invoice#, it might be a bit cumbersome, as we can not use the LOCATE statement, as the LOCATE statement will look for an exact match to the string that we are searching with.
One method might be to:
1) determine the number of multivalues 2) loop through the multivalues parsing apart the invoice date and invoice number 3) compare each invoice# in the multivalued field to the invoice# that we are searching for
This would work, or we can use the INDEX/DCOUNT combo. But, if I may be so bold as to quote Lee Burstein "Know your data."
In our example, we know that our values in the multi-valued field are the invoice date in internal format concatenated to the invoice number with an asterisk, and in sorted order by the invoice date. And we know this NOT by looking at this individual record, but by knowing that this is how the application software is building the data.
|
| Where the invoice# to be found is 100227 and in variable INVOICE.NO |
|
| |
|
| POS = INDEX(CUST.REC<7>,INVOICE.NO,1) |
POS wil = 20 which is the character position of the start |
| |
of the invoice# in the multivalued string |
| |
14701*100120]14722*100227]14750*100331 |
| |
20th character |
| |
|
| MV.POS = DCOUNT(CUST.REC<7>[1,POS],CHAR(253)) |
The DCOUNT will count the number of values separated |
| |
by the multivalue mark (ASCII 253) starting with the 1st |
| |
through the 20th character of the string |
| |
14701*100120]14722*1 |
| |
MV.POS will equal 2 |
| |
|
|
For those of you familiar with RPL (Realtime Programming Language), a product of SMI (Systems Management Inc., or Realtime Software), this is similar to the MVL statement with a wild card.
Note: the INDEX/DCOUNT combo is only a suggested format of searching for a partial match in a multivalued field, as opposed to looping through all the values and comparing each value. This technique may not apply to your data structure and should never be used when LOCATE statements would suffice. And as always, "Know your data."
If the multivalued field did NOT include the invoice date, and was only a multivalued field of invoice numbers, the best method of searching would be to use the LOCATE statement as it searches the attribute looking at each multi value to find the multi-value that you are searching for.
LOCATE(INVOICE.NO,CUST.REC,7;MV.POS;'AR') THEN * found it END ELSE * returns the position it should be in if the value were to be inserted * and this example is using the AR, ascending right for the sort sequence END
|
|