UCF Research Cyberinfrastructure (RCI)

Data Privacy Compliance

To learn more, visit UCF's University Compliance, Ethics, and Risk website.

Securing and Protecting Research Data

Privacy and the protection of personal information are considered fundamental rights and therefore must be safeguarded. Confidential information involving the university, which often falls into the categories of highly restricted or restricted data, must also be protected. Hence, researchers are required to secure and protect information considered confidential and/or personal. Information that relates to business, financial, or medical transactions, or that simply identifies individuals, must be protected from loss, misuse, and unauthorized access or modification. Examples of these types of data include an individual's name, address, date of birth, telephone number, social security number, personal photograph or fingerprint(s), amounts paid or charged in financial transactions, and account numbers. There are many other data types as well.

The Family Educational Rights and Privacy Act (FERPA) permits the release (and use) of student directory information, which includes name, current mailing address, telephone number, date of birth, major, dates of attendance, enrollment status, degrees/awards received, participation in officially recognized activities and sports, and athletes' height and weight, provided the student has not requested a restriction on release.
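
As a rough illustration of the rule above, the sketch below filters a student record down to directory information and withholds everything when a restriction has been requested. The field names, the `releasable_directory_info` function, and the sample record are hypothetical and are not taken from any UCF system.

```python
# Directory-information fields as listed in the paragraph above.
# The dictionary keys are hypothetical; a real student system will differ.
DIRECTORY_FIELDS = {
    "name", "mailing_address", "telephone", "date_of_birth", "major",
    "dates_of_attendance", "enrollment_status", "degrees_awards",
    "activities_sports", "athlete_height_weight",
}


def releasable_directory_info(record, restriction_requested):
    """Return only directory fields, or nothing if release is restricted."""
    if restriction_requested:
        return {}
    return {k: v for k, v in record.items() if k in DIRECTORY_FIELDS}


# Fictional record for illustration only.
student = {"name": "Jane Doe", "major": "Physics", "ssn": "000-00-0000", "gpa": 3.7}
public = releasable_directory_info(student, restriction_requested=False)
withheld = releasable_directory_info(student, restriction_requested=True)
```

Using an allowlist means that any field not explicitly recognized as directory information, such as the social security number here, is excluded by default.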

Acquiring and using protected health information (PHI) for human subject research involving US citizens requires compliance with the Health Insurance Portability and Accountability Act (HIPAA) and may mandate the use of a HIPAA Authorization Form.

When possible, researchers working with human subjects should avoid collecting personal information. If personal information must be collected, however, it is imperative to limit the types of information collected and associate them with one or more personal identifiers. A personal identifier, or pseudonym, is a piece of personal information that identifies a unique individual only when it is cross-referenced with other personal information. For example, "Subject 00A" has no meaning by itself yet, when linked to actual personal information, identifies a unique individual. By storing an individual's actual name and other personal information separately, linking that information to a personal identifier, and then using only the identifier within your research, you can effectively protect individuals' personal information: published research reveals nothing but the personal identifier.
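
The separation described above can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the `assign_pseudonyms` function, the field names, and the sample participants are all hypothetical, and random tokens are only one of several ways to generate pseudonyms.

```python
import secrets


def assign_pseudonyms(participants, prefix="Subject"):
    """Split participant records into a key table and research records.

    key_table maps each pseudonym to personal information and is meant
    to be stored separately from the research data.
    research_records carry only the pseudonym and study data.
    """
    key_table = {}
    research_records = []
    for person in participants:
        # Draw random tokens until an unused one is found, so the
        # pseudonym reveals nothing about the person or enrollment order.
        while True:
            pseudonym = f"{prefix} {secrets.token_hex(4).upper()}"
            if pseudonym not in key_table:
                break
        key_table[pseudonym] = {
            "name": person["name"],
            "date_of_birth": person["date_of_birth"],
        }
        research_records.append({"id": pseudonym, "response": person["response"]})
    return key_table, research_records


# Fictional participants for illustration only.
participants = [
    {"name": "Alex Example", "date_of_birth": "1990-01-01", "response": 4},
    {"name": "Sam Sample", "date_of_birth": "1985-06-15", "response": 2},
]
key_table, research_records = assign_pseudonyms(participants)
```

Only `research_records` would be used for analysis and publication; `key_table` is the cross-reference that must be kept apart from it.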

Confidential and/or personal information should be kept only on university-owned media, such as computers and servers. If highly restricted data are involved, that information must be encrypted using current, industry-standard software, and access to it must be limited to those individuals directly involved with the research. Highly restricted data must never be stored on mobile devices (laptops, smartphones, USB drives, and similar) that are not owned by the university. If stored online, highly restricted data may reside only within UCF-sanctioned cloud data storage systems, with access limited to those directly involved with the research. Similarly, if restricted data are involved, that information may be stored on workstations or mobile devices that use file-level or full-disk encryption, and only on cloud data storage systems sanctioned by the university.

When storing a list of pseudonyms (personal identifiers), it is best to keep it separate from the personal information and on a UCF-sanctioned storage device or cloud service.

Personal Data

'Personal data' means any information relating to identified or identifiable natural persons, who are referred to as 'data subjects'.

'Identifiable persons' can be identified:

  • directly, or
  • indirectly, by reference to an identification number or to one or more factors specific to their physical, psychological, genetic, mental, economic, cultural, or social identity.


What is 'processing'?

'Processing of personal data' means: any operation or set of operations which is performed upon personal data, whether or not by automatic means, such as collection, recording, organization, storage, analysis, adaptation or alteration, retrieval, consultation, use, disclosure, transmission, sharing, dissemination or otherwise making available, alignment or combination, blocking, erasure, or destruction.

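Whenever you process personal data, keep in mind that the processing must be necessary and proportionate with respect to each of the following questions:

  1. What data are processed?
  2. Why are they processed?
  3. How are they processed?
  4. For how long are they processed?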

Sensitive Data

'Sensitive data' are data revealing racial or ethnic origin, political opinions, philosophical or religious beliefs, or trade-union membership, as well as genetic data, biometric data, and data concerning health or relating to sexual orientation or activity.

Examples of sensitive data:

  • criminal records or information about legal investigations
  • membership in a religious or political group
  • sexual orientation
  • health-related records (e.g., patient records, biometric data, medical photographs, diet information, hospital information records, biological traits, and genetic material)
  • location data, such as visas, residence records, GPS recordings, or other geographic recordings

As a rule, the processing of sensitive data is prohibited. However, Article 9 of the General Data Protection Regulation (GDPR) sets out specific circumstances under which the sensitive data of 'data subjects' in the European Economic Area (EEA) may be processed.

Tips Involving Data Transfer

  • If data processing is outsourced, remove personal data where practicable and as much as possible, so that only pseudonymous ID numbers are used to link individual-level data with participants' identities.
  • Assess the level of protection afforded by a third country [not a member of the European Economic Area (EEA)] or international organization in the light of all circumstances surrounding a data transfer operation or set of data transfer operations.
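
The first tip can be sketched as an allowlist applied before any data leave the institution. The field names, the `prepare_export` function, and the sample row below are hypothetical; which fields are safe to send depends on the study and is an assumption here.

```python
# Hypothetical field names; only these fields are sent to the processor.
EXPORT_FIELDS = ("pseudonym", "age_group", "survey_score")


def prepare_export(rows, allowed=EXPORT_FIELDS):
    """Keep only allowlisted fields so names and contact details stay behind."""
    export = []
    for row in rows:
        missing = [field for field in allowed if field not in row]
        if missing:
            raise KeyError(f"row is missing required fields: {missing}")
        export.append({field: row[field] for field in allowed})
    return export


# Fictional data for illustration only.
rows = [
    {"pseudonym": "Subject 00A", "name": "Alex Example",
     "email": "alex@example.org", "age_group": "30-39", "survey_score": 17},
]
outsourced = prepare_export(rows)
```

An allowlist is used rather than a blocklist so that a personal field added to the dataset later is withheld unless someone deliberately adds it to `EXPORT_FIELDS`.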

Anonymization and Pseudonymization

Anonymization techniques make data subjects unidentifiable.

A major advantage of anonymization is that it allows research that would not otherwise be possible due to privacy concerns.

Pseudonymization leaves the data subject identifiable through the combination of the pseudonym (e.g., key-code, code number) and additional identifiers. The time and effort required to identify the individual, as well as the available technologies, are decisive in determining whether the data subject can be identified from pseudonymized data.

Anonymization excludes any possibility of identifying the data subject.

Example: Collecting data from immigrants about their immigration experiences could add value to research on irregularities or patterns in migration, but it could also seriously infringe upon people's privacy and put them at risk of prosecution by the authorities, as well as persecution by human smugglers.

One possible solution would be to remove direct identifiers, such as names, birth dates, and addresses, although this might not be sufficient to prevent the data from being traced back to individuals.

An effective anonymization solution prevents all parties from singling out an individual in a dataset by:

  • linking several records within a dataset (or between several separate datasets)
  • inferring any information from such a dataset
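
One rough way to look for singling-out risk is to count how many records share each combination of indirectly identifying attributes; a combination that occurs only once points at a single individual. The sketch below is a k-anonymity-style uniqueness check, a technique not named on this page, and passing it does not by itself show that a dataset is anonymized. The `singled_out` function, the attribute names, and the sample records are hypothetical.

```python
from collections import Counter


def singled_out(records, quasi_identifiers):
    """Return records whose quasi-identifier combination is unique."""
    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] == 1]


# Fictional records with direct identifiers already removed.
records = [
    {"birth_year": 1990, "region": "North", "origin": "A"},
    {"birth_year": 1990, "region": "North", "origin": "A"},
    {"birth_year": 1985, "region": "South", "origin": "B"},
]
unique = singled_out(records, ["birth_year", "region", "origin"])
```

Here the third record is the only one with its combination of attributes, so it could be singled out even though it contains no name or address.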

Anonymization excludes any possibility of identifying the data subject; however, it is much more difficult to achieve and sometimes yields less valuable research results.

While anonymization is preferred from a data protection perspective, pseudonymization is most often acceptable.